Archives for Data Science

Sensing the Air Quality

August 22, 2019 (1 comment)

A low-cost IoT Air Quality Monitor based on Raspberry Pi 4


I have the privilege of living in one of the most beautiful countries in the world, but unfortunately, it’s not all roses. During the winter, Chile suffers heavily from air pollution, mainly due to particulate matter such as dust and smog.


Because of the cold weather, air pollution in the south is mainly due to wood-burning heaters. In Santiago (the capital, in the center of the country), it comes from a mix of industry, cars, and the city's unique geographic location between two huge mountain chains.


Air pollution is now a major problem all over the world. In this article we will explore how to build a low-cost, homemade air quality station based on a Raspberry Pi.

If you are interested in learning more about air quality worldwide, please visit the “World Air Quality Index” project.

Continue reading…

How safe are the streets of Santiago?

August 16, 2019 (1 comment)

Let’s answer it with Python and GeoPandas!

Costanera Center, Santiago / Benja Gremler

Some time ago I wrote an article, Mapping Geography Data in Python, explaining how to work with geographic maps in Python the “hard way” (mainly with Shapely and Pandas). Now it is time to do it again, this time the easy way, using GeoPandas, which can be understood as Pandas + Shapely in the same package.

GeoPandas is an open-source project that makes working with geospatial data in Python easier. It extends the datatypes used by Pandas to allow spatial operations on geometric types.
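As a rough illustration of what “Pandas + Shapely in the same package” means, here is a minimal sketch. The sample points, the column names and the bounding box are made up for this example and are not taken from the project's data; the sketch assumes `geopandas`, `pandas` and `shapely` are installed.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical sample points (longitude, latitude); not real crash data.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "lon": [-70.65, -70.60, -71.50],
        "lat": [-33.45, -33.42, -33.00],
    }
)

# A GeoDataFrame is a regular DataFrame plus a geometry column and a CRS.
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",  # plain longitude/latitude coordinates
)

# Illustrative bounding box: (min lon, min lat, max lon, max lat).
area = box(-70.80, -33.60, -70.50, -33.30)

# A spatial predicate applied row by row, used like any Pandas boolean mask.
inside = gdf[gdf.within(area)]
print(inside["name"].tolist())
```

The filtering step is ordinary Pandas boolean indexing; only the predicate (`within`) is spatial, which is the convenience GeoPandas adds over handling Shapely objects by hand.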

The motivation for this article was a recent project proposed by our professor, Oscar Peredo, and developed with my colleagues Fran Gortari and Manuel Sacasa for the Big Data Analytics course of the Data Science master's degree at UDD (Universidad del Desarrollo).

The objective of that project was to explore whether state-of-the-art machine learning algorithms could predict a crash risk score for an urban grid, based on public car crash data from 2013 to 2018. The purpose of this article, on the other hand, is simply to learn how to use GeoPandas on a real problem by answering a question:

“How safe are the streets in Santiago?”

If you want to know what we did in the project proposed for our Data Science master's degree, please visit its GitHub repository.

Continue reading…

The idea of this tutorial is to capture tweets, analyze them for the most used words and hashtags, and classify them by the sentiment behind them (positive, negative or neutral).
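The excerpt does not show which tools the tutorial uses, so the following is only a minimal sketch of the idea using the Python standard library. The sample tweets and the tiny `POSITIVE` / `NEGATIVE` word lists are invented for illustration; a real analysis would collect tweets through an API and use a proper sentiment model or lexicon.

```python
import re
from collections import Counter

# Hypothetical sample tweets standing in for captured data.
tweets = [
    "I love the clean air today! #Santiago #airquality",
    "Terrible smog again... I hate winter #Santiago",
    "Measuring PM2.5 with my Raspberry Pi #airquality",
]

# Toy word lists, assumed purely for illustration (not a real lexicon).
POSITIVE = {"love", "clean", "great", "good"}
NEGATIVE = {"hate", "terrible", "bad", "smog"}


def hashtags(text):
    """Return the hashtags in a tweet, lowercased and without the '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def words(text):
    """Return lowercase alphabetic tokens, ignoring hashtags."""
    without_tags = re.sub(r"#\w+", " ", text)
    return re.findall(r"[a-z']+", without_tags.lower())


def sentiment(text):
    """Label a tweet by counting positive and negative words."""
    tokens = words(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Frequency of words and hashtags across all tweets.
word_counts = Counter(w for t in tweets for w in words(t))
hashtag_counts = Counter(h for t in tweets for h in hashtags(t))

# One sentiment label per tweet, in the same order as `tweets`.
labels = [sentiment(t) for t in tweets]

print(hashtag_counts.most_common(2))
print(labels)
```

Word counting like this is crude (no stop-word removal, no handling of negation), but it shows the three steps named above: count words, count hashtags, and assign each tweet one of three sentiment labels.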

Continue reading…