Random Forest Tutorial: Predicting Crime in San Francisco



Can several wrongs make a right? While it may seem counter-intuitive, this is possible, sometimes even preferable, in designing predictive models for complex problems such as crime prediction.
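The intuition that many imperfect predictors can combine into a good one can be illustrated with a small simulation. This is only a sketch of majority voting, not a random forest itself: it assumes each model is right 60% of the time and that the models err independently, which real models rarely do. The function name and all numbers are invented for illustration.

```python
import random

def majority_vote_accuracy(n_models, p_correct, n_trials, seed=0):
    """Estimate how often a majority of independent models is right."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # Assumption: each model is independently correct with probability p_correct.
        correct_votes = sum(rng.random() < p_correct for _ in range(n_models))
        if correct_votes > n_models / 2:
            wins += 1
    return wins / n_trials

single = majority_vote_accuracy(n_models=1, p_correct=0.6, n_trials=5000)
ensemble = majority_vote_accuracy(n_models=51, p_correct=0.6, n_trials=5000)
print(f"one model: {single:.2f}, 51 models voting: {ensemble:.2f}")
```

Under these assumptions the vote of 51 models is right far more often than any single model. The gain shrinks when the models make the same mistakes, which is why random forests try to keep their trees different from one another.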

The Problem

In the film Minority Report, police officers were able to predict and prevent murders before they happened. While current technology is nowhere near that level, predictive policing has been implemented in some cities to identify locations with high crime. Location-based crime records could be coupled with other data sources, such as the income levels of residents or even the weather, to forecast crime occurrence. In this chapter, we build a simple random forest to forecast crime in San…


Decision Trees Tutorial


Would you survive a disaster?

Certain groups of people, such as women and children, might be entitled to receive help first, granting them a higher chance of survival. Knowing whether you belong to one of these privileged groups would help predict whether you would make it out alive. To identify which groups have higher survival rates, we can use decision trees.

While we forecast the rate of survival here, decision trees are used in a wide range of applications. In a business setting, they can be used to define customer profiles or to predict who would resign.


A decision tree leads you to a prediction by asking a series of questions about whether you belong to certain groups (see Figure 1). Each question must have only two possible responses, such as “yes” versus “no”. You start at the top question, called the root node, then move through the tree branches according to…
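A minimal sketch of how such a tree produces a prediction is shown here. The questions, their order, and the predictions are hypothetical and hand-written for illustration; they are not learned from data and are not the tree from the original post.

```python
# Hypothetical tree: every question has exactly two branches, "yes" and "no".
tree = {
    "question": "is_female",  # root node
    "yes": {"prediction": "survived"},
    "no": {
        "question": "is_child",
        "yes": {"prediction": "survived"},
        "no": {"prediction": "did not survive"},
    },
}

def predict(node, person):
    """Walk from the root node to a leaf, answering one yes/no question per step."""
    while "prediction" not in node:
        answer = "yes" if person[node["question"]] else "no"
        node = node[answer]
    return node["prediction"]

print(predict(tree, {"is_female": False, "is_child": True}))
```

In practice the questions and their order are chosen by a learning algorithm from the data rather than written by hand.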


Downgrade pip

This saved my day today! 🙂

THA Pipeline

Wow, the new pip 8.1.2 is so horribly broken that it gives us all kinds of problems in our daily work, especially in combination with devpi. No idea how that could ever get released. For everyone encountering issues with it (failing to parse requirement strings, failing to install) I recommend the following line:

It will force a downgrade to 8.1.1 and you should be good to go again.


Where Will Your Country Stand in World War III?


In the recent Panama Papers scandal, journalists analyzed 11.5 million documents using network graphs to trace the use of offshore tax structures. In this chapter, we use a network graph technique called Social Network Analysis (SNA) to map weapons transfers between countries. By analyzing bilateral weapons trade, a network of multilateral ties can be distilled, providing insights into the complex arena of international politics.

SNA is based on mathematics and computer science concepts, and is applied in many social science disciplines. It analyzes relationships between individuals, uncovering social circles and influential people within a network. For instance, it can identify the main character in Game of Thrones, a popular television series. SNA is also used in government intelligence to map out crime rings and terrorist cells. Apart from people, other entities such as objects can be mapped in a network as well.
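One simple way to look for influential entities in a network is to add up the strength of each node's ties. The sketch below assumes this weighted-degree measure, which may differ from the measures used in the original post, and the country labels and trade values are made up.

```python
from collections import defaultdict

# Hypothetical bilateral trade records: (country_a, country_b, trade_value).
trades = [
    ("A", "B", 5.0),
    ("A", "C", 3.0),
    ("B", "C", 1.0),
    ("C", "D", 2.0),
]

def weighted_degree(edges):
    """Sum the weights of all ties touching each node."""
    totals = defaultdict(float)
    for a, b, weight in edges:
        totals[a] += weight
        totals[b] += weight
    return dict(totals)

degree = weighted_degree(trades)
most_connected = max(degree, key=degree.get)
print(degree, most_connected)
```

Here country A has the largest total trade with its partners, even though country C has more partners, so the choice of measure affects which node looks most important.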

Introduction to Graphs

In SNA, network structures…


Association Rules and the Apriori Algorithm


The Problem

When we go grocery shopping, we often have a standard list of things to buy. Each shopper has a distinctive list, depending on one’s needs and preferences. A housewife might buy healthy ingredients for a family dinner, while a bachelor might buy beer and chips. Understanding these buying patterns can help to increase sales in several ways. If there is a pair of items, X and Y, that are frequently bought together:

[Image: Product placement in Tesco, UK.]

  • Both X and Y can be placed on the same shelf, so that buyers of one item would be prompted to buy the other.
  • Promotional discounts could be applied to just one out of the two items.
  • Advertisements on X could be targeted at buyers who purchase Y.
  • X and Y could be combined into a new product, such as having Y in flavors of X.
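To find pairs that are frequently bought together, one starting point is to count how often each pair appears in the same basket. The sketch below computes this share (commonly called support) by brute force over hypothetical baskets. It checks every pair and does not include the pruning step that gives the Apriori algorithm its efficiency.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"milk", "bread"},
    {"beer", "bread"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: count / len(baskets) for pair, count in counts.items()}

support = pair_support(transactions)
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
```

With these baskets, beer and chips appear together in half of all transactions, making them the most frequent pair.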

While we may know that…
