Data Science Ethics: my initial thoughts

I had two main thoughts about this: self-regulation by the data science profession, and data literacy. The promise of big data and artificial intelligence is at an all-time high, but by no means at its peak. The availability of data to mine is growing exponentially. And yet the data science community is still relatively […]

Exploration of Tree Based Gradient Boosting Models to classify terrorism events as Suicide Attacks

Using Tree Based Gradient Boosting Models to classify terrorism events as Suicide Attacks. Tracy Keys, 13 June 2017. Background: My team Gonzo at UTS used the Global Terrorism Database (GTD) to explore whether distinct features of terrorism events could predict the ABC’s online reaction to them. We did this through web scraping the ABC’s Twitter […]

According to Mark Zuckerberg, Facebook is not a media company

According to CEO Mark Zuckerberg, Facebook, the world’s largest social media platform[i], is not a media company[ii]. Zuckerberg explained in August 2016: “No, we are a tech company, not a media company… We build the tools, we do not produce any content.”[iii] One of those tools is the Facebook News Feed, which provides every one […]

Using topicmodels package for analysis of topics in texts

My vignette is about text mining and analysis, utilising the tm and topicmodels packages in R and Latent Dirichlet Allocation, to work out what the documents are written about without having to read them all! The vignette shows you how to create a Document-Term Matrix, then uses LDA to work out what key themes are […]

Is there a sexist data crisis? Hardly a crisis, but still important to resolve

In our session on Tuesday Simon K, as an aside, suggested we google “is there a sexist data crisis.” I did (here is a BBC article with that exact title: http://www.bbc.com/news/magazine-36314061), but it got me thinking: this is hardly a crisis and hardly new. Women are underrepresented in many important things. For example, did […]

Missing data codification OR how to capture that slap across the face

Last week we read about missing data and how to plan for it. I found it super useful and applicable to our Quantified Self work: if we had codified our missing data, I would have had less chasing up to do, and we would have had some insights into the boundaries of what we were […]

The journey to Ithaca starts with a single step…

I felt like I attended the University of Technology Sydney’s Masters of Data Science and Innovation (MDSI) information session on a whim. But with hindsight the dots join together rather nicely, and the decision to enrol feels so right! I am going to try out every facet of this experience and jump in with both […]