Simple Visualization of Poverty Overview in Indonesia using Matplotlib

Aron Akhmad
6 min read · Nov 26, 2020
Photo by Zeyn Afuang on Unsplash

In data analytics, plotting is immensely important because it helps us draw insights from the data. There are many plotting tools available. However, since Python has become one of the most popular programming languages, especially among data scientists, I'm going to show you how to plot your data using one of the best-known Python libraries for data plotting, matplotlib. Actually, at first I just wanted to write an article visualizing Indonesia's poverty year by year, but since I've posted Python tutorials before, why not share the code too, right? Teehee.

Matplotlib is one of the most widely used Python libraries for data plotting. It is known for its reliability, convenience, and simplicity, though its default output might not be the prettiest. You can plot your data with just a few lines of code and ta-da, your data is visualized. In this article, you will learn simple plotting with matplotlib as well as simple analysis (or maybe just graph reading, lmao) of the results. Please note that the code here is just an example meant to give you an overview of how matplotlib is used for data plotting. You may adjust it depending on how you want your data plotted. The following sections walk through an example using year-by-year data on poverty in Indonesia.

Import Modules

First things first, you need to import several modules. Three modules are used here: pandas, NumPy, and matplotlib. Pandas is used for loading and handling the data file, NumPy for generating arrays, and matplotlib for the plotting itself, obviously. Teehee.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Jupyter magic: render plots inline in the notebook (drop this line in a plain .py script)
%matplotlib inline

plt.style.use('ggplot')  # for a better-looking graph style
pd.set_option('display.float_format', '{:.2f}'.format)  # show floats with two decimals

Import Data

Load the data that you want to plot. We're using pandas for this. The code below opens an Excel file, so adjust it to match the type of file you have. Note that xls.parse(0) parses the first sheet of the Excel file; again, adjust it as needed. Calling data.tail() prints the last five rows so you can take a quick look at how the data is laid out.

xls = pd.ExcelFile(r'indonesia_poverty.xlsx')
data = xls.parse(0)   # parse the first sheet
data = data.dropna()  # drop rows with missing values
data.tail()           # show the last five rows

output: [table: the last five rows of the data]
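If your data is not in an Excel file, the loading step is the only part that changes. Here is a minimal sketch of a loader that picks the pandas reader from the file extension. The function name load_table and its sheet parameter are made up for this illustration and are not part of the original code.

```python
from pathlib import Path

import pandas as pd


def load_table(path, sheet=0):
    """Load a table from a CSV or Excel file, chosen by file extension.

    Hypothetical helper for illustration; extend it for other formats.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xls", ".xlsx"):
        # sheet=0 means the first sheet, like xls.parse(0)
        return pd.read_excel(path, sheet_name=sheet)
    raise ValueError(f"unsupported file type: {suffix!r}")
```

With this in place, something like data = load_table('indonesia_poverty.xlsx').dropna() should give the same DataFrame as the code above, assuming an Excel engine such as openpyxl is installed.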

Data Plotting

Now that the data is loaded, we can plot it right away. In this particular case, the data needs no cleaning beyond dropping empty rows, so it is ready to use. In other cases, you may need to clean the data before plotting it. The code below plots the number of people living under the poverty line in Indonesia, split into urban, village, and overall totals. The output shows a fluctuating graph: the numbers have gone up and down over the past 9 years, with a mostly downward trend from Q1 2017 to Q3 2019. The graph went up again in Q1 2020, most probably because of the COVID-19 pandemic.

def text_value(x, y):
    # label every point with its value, formatted to two decimals
    for i, j in zip(x, y):
        plt.annotate("{:.2f}".format(j), (i, j))

fig, ax = plt.subplots(figsize=(20, 7))
x_axis = data['Tahun'].iloc[-19:]  # 'Tahun' means year; take the last 19 rows

# number of people living in poverty: urban (Kota), village (Desa), and total
y1 = data['Jumlah Penduduk Miskin (Kota)'].iloc[-19:]
y2 = data['Jumlah Penduduk Miskin (Desa)'].iloc[-19:]
y3 = data['Jumlah Penduduk Miskin (Total)'].iloc[-19:]

for y in (y1, y2, y3):
    plt.plot(x_axis, y, linestyle='dashed', marker='o', markersize=12)
    text_value(x_axis, y)

ax.set_ylabel('Total Population Living in Poverty (in million)')
plt.title('Comparison of Total Population Living under Poverty Line in Indonesia')
plt.legend(['Total Urban Population Living in Poverty',
            'Total Village Population Living in Poverty',
            'Overall Total Population Living in Poverty'])

output: [figure: line chart, "Comparison of Total Population Living under Poverty Line in Indonesia"]
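With large markers, value labels placed exactly on the data points can end up sitting on top of the markers. One possible tweak, sketched here with made-up numbers rather than the BPS data, is to shift each label a few points above its marker using the xytext and textcoords arguments of annotate. The helper name annotate_points is hypothetical.

```python
import matplotlib.pyplot as plt


def annotate_points(ax, x, y, fmt="{:.2f}", offset=(0, 10)):
    """Write each y value slightly above its marker on the given axes."""
    labels = []
    for xi, yi in zip(x, y):
        label = ax.annotate(
            fmt.format(yi),
            (xi, yi),
            xytext=offset,               # shift measured in points, not data units
            textcoords="offset points",
            ha="center",
        )
        labels.append(label)
    return labels


# Made-up values purely for illustration
positions = [0, 1, 2, 3]
values = [3.0, 2.5, 2.0, 2.75]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(positions, values, linestyle="dashed", marker="o", markersize=12)
labels = annotate_points(ax, positions, values)
```

Passing the axes object explicitly, instead of relying on plt.annotate, also makes the helper safe to reuse when a figure has more than one subplot.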

The second plot shows pretty much the same thing as the previous one, but in percent. Here I use a vertical bar plot so that we can compare the three series side by side. The graph tells the same story as the first one.

fig, ax = plt.subplots(figsize=(20, 7))
x_axis = np.arange(19)  # one group position per period

# poverty percentage: urban (Kota), village (Desa), and total
y1 = data['Persentase Kemiskinan (Kota)'].iloc[-19:]
y2 = data['Persentase Kemiskinan (Desa)'].iloc[-19:]
y3 = data['Persentase Kemiskinan (Total)'].iloc[-19:]

# shift each series by one bar width so the bars sit side by side
plt.bar(x_axis, y1, width=0.2, align='center')
plt.bar(x_axis + 0.2, y2, width=0.2, align='center')
plt.bar(x_axis + 2 * 0.2, y3, width=0.2, align='center')

ax.set_xticks(x_axis + 0.2)  # put the tick under the middle bar
ax.set_xticklabels(list(data['Tahun'].iloc[-19:]))
ax.set_ylabel('Poverty Percentage (%)')
plt.title('Poverty Percentage Comparison in Indonesia')
plt.legend(['Urban Poverty Percentage', 'Village Poverty Percentage', 'Overall Poverty Percentage'])

output: [figure: grouped bar chart, "Poverty Percentage Comparison in Indonesia"]
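The bar positions above are written by hand for exactly three series. If you want a different number of series, the offsets can be computed instead. This is a small sketch with made-up numbers; the helper group_offsets is hypothetical, and it centres each group on its tick so the tick positions need no extra shift.

```python
import numpy as np
import matplotlib.pyplot as plt


def group_offsets(n_series, width):
    """Return one x-offset per series so each group is centred on its tick."""
    return [(i - (n_series - 1) / 2) * width for i in range(n_series)]


# Made-up values purely for illustration
series = {
    "Series A": [1.0, 2.0, 3.0],
    "Series B": [2.0, 3.0, 4.0],
    "Series C": [1.5, 2.5, 3.5],
}
tick_labels = ["P1", "P2", "P3"]

x = np.arange(len(tick_labels))
width = 0.2

fig, ax = plt.subplots(figsize=(8, 4))
for offset, (name, values) in zip(group_offsets(len(series), width), series.items()):
    ax.bar(x + offset, values, width=width, label=name)

ax.set_xticks(x)  # groups are already centred on x
ax.set_xticklabels(tick_labels)
ax.legend()
```

For three series and a width of 0.2 this gives offsets of -0.2, 0 and 0.2, which is the same layout as the code above, just shifted so the middle bar sits on the tick.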

Below is a plot comparing the poverty line in urban and village areas. Just so you know, the poverty line was reported separately for urban and village areas up until 2017. I use a horizontal bar plot because it makes the two series easier to compare. From the graph, we can see that the poverty line threshold increases steadily year by year. This is expected, as inflation affects the value of money as well as grocery prices.

fig, ax = plt.subplots(figsize=(20, 7))  # new figure, so ax refers to this plot
x_axis = np.arange(36 - 23)  # one position per row in the slice 23:36

# poverty line: village (Desa) and urban (Kota)
y1 = data['Garis Kemiskinan Desa'].iloc[23:36]
y2 = data['Garis Kemiskinan Kota'].iloc[23:36]

plt.barh(x_axis, y1, height=0.3, align='center')
plt.barh(x_axis + 0.3, y2, height=0.3, align='center')

ax.set_yticks(x_axis + 0.15)  # put the tick between the two bars
ax.set_yticklabels(list(data['Tahun'].iloc[23:36]))
ax.set_xlabel('Poverty Line (Rp)')
plt.title('Poverty Line Comparison in Urban and Village Area')
plt.legend(['Village Poverty Line', 'Urban Poverty Line'])

output: [figure: horizontal bar chart, "Poverty Line Comparison in Urban and Village Area"]

The last plot is similar to the previous one. Since Q2 2017, the BPS (Indonesia's Central Bureau of Statistics) website has shown only a single poverty line threshold and, judging from the values, it was most likely based on the previous urban threshold. So, in the code below we plot the urban poverty line threshold from Q1 2011 to Q1 2017, extended with the general poverty line threshold from Q2 2017 to Q1 2020. The output shows that the threshold always increases, but the rate of increase fluctuates until Q1 2018. From Q3 2011 to Q1 2012 and from Q3 2017 to Q1 2018 the graph is almost flat, meaning the rate of increase was remarkably low. On the other hand, from Q3 2018 to Q1 2020 the graph is almost a straight line, meaning the rate of increase was fairly stable.

fig = plt.figure(figsize=(20, 8))
y = data['Garis Kemiskinan Kota'].iloc[-19:]  # urban poverty line, last 19 rows
x = data['Tahun'].iloc[-19:]

# label every point with its value as a whole number
for i, j in zip(x, y):
    plt.annotate(str(int(j)), (i, j))

plt.ylabel('Poverty Line (Rp)')
plt.title("Indonesia's Poverty Line Graph")
plt.plot(x, y, marker='o', markersize=10, markerfacecolor='white', linestyle=':')

output: [figure: line chart, "Indonesia's Poverty Line Graph"]
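Reading the rate of increase off the slope of a line is a bit rough, so it can help to compute it directly. A minimal sketch, using made-up values rather than the BPS figures, with the pandas pct_change method:

```python
import pandas as pd

# Hypothetical poverty-line values in Rp, made up for illustration only
line = pd.Series(
    [100_000, 110_000, 110_000, 121_000],
    index=["P1", "P2", "P3", "P4"],
)

# Change from the previous period, as a percentage.
# The first entry is NaN because it has no previous period.
step_up = (line.pct_change() * 100).round(2)
print(step_up)
```

A flat stretch in the line plot shows up here as a value near zero, and a straight rising line as roughly equal values from one period to the next. On the article's data, the same idea would presumably be data['Garis Kemiskinan Kota'].iloc[-19:].pct_change() * 100.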

So that was a simple plotting of Indonesia's poverty overview year by year. This was just the basics; there is much more to explore in matplotlib, such as interactive graphs and other interesting features beyond what I've shown here. I really hope this article was useful and gave you a broad overview of how to use matplotlib for data plotting, hehe. Anyway, thank you for reading and don't forget to stay healthy. ✨


Aron Akhmad

〖A data geek 📊〗〖Life-long learner〗〖ESFP-T〗〖✨ŸØⱠØ✨〗