I’m an NHS data analyst based in Inverness, Scotland.

I don’t think “data analyst” accurately conveys what I do on a daily basis. Hence, you’ll find me on Twitter as @HighlandDataSci.

I don’t feel entirely comfortable with the term “data scientist” - but I do a lot of importing/merging & cleaning of data sets, writing code, data visualisation, and explaining results to others. In short - I’m getting there!

I began using R for data visualisation, and, over time, it has become integral to my workflow. It’s just so flexible, and the more you get immersed in the R world, the more you realise what you can do with it.

The main packages I use are ggplot2 and qicharts for plotting, dplyr for data manipulation and RODBC for linking to MS SQL Server.

I’m also getting to grips with data.table, which is both fast and reduces the amount of code I need to write.

I’ve also used R for:

I’m also interested in using R for time series (using forecast) and predictive analytics.

In addition to R, and driven by the large-scale data requirements of my role, I’ve also developed skills in:

  • SQL, including complex T-SQL queries and SQL Server Integration Services (SSIS)
  • Extensive experience of using data for improvement, particularly Run Charts and Statistical Process Control
  • SQL Server Reporting Services
  • QlikView

In addition:

  • Excel Dashboards / Advanced Excel
  • VBA, including custom functions and manipulating other MS Office software from within Excel

About this blog:

This is a forum for me to show some R work:

  • Anything I produce for demonstration at our local R user group (InveRness RUG)
  • Notes to myself on things I’ve learned (the hard way) at work
  • Chart makeovers using R
  • Exploring techniques and technology that I don’t currently use in my role

Code examples

Most of the code I write will appear on my github as a gist or repository.

My code is pretty well commented, so I don’t go into too much depth in these posts.

Some posts will relate to things I am learning for my own development, so they may not go into a lot of detail, but I aim to update them as my learning increases.

Please check the session info on the code scripts for details of loaded packages and dependencies.

Disclaimer

My R skills are not at the level of the awesome Hadley et al., at least not yet.
It does bug me when I see an interesting blog post and find that the code does not work when you try to use it.
I am confident that if you run my code as is, it will work. There may be a slicker way of doing things, but there will be an end result.

Finally

At the moment I’m not posting as frequently as I would like - the challenge of work and a young family.

I’ve disabled comments / disqus on the site because of their effect on page load times. However, please do get in touch if you have any comments.

You’ll find my contact details at the bottom of the page, and I’m also on Twitter as @HighlandDataSci.

John