If you are a Machine Learning enthusiast and haven't heard about this AWS service yet, let me introduce SageMaker.

If you're an ML enthusiast, it's definitely a must-have tool in your modeling arsenal. I've also been looking forward to writing about this service, since it's the one that got me to know the other AWS services a little better. Hope you like it.

AWS SageMaker

SageMaker is an end-to-end service that supports all of your modeling stages. It allows you to preprocess, train, test, analyze, deploy and monitor multiple machine learning models, whether you use your own scripts or load existing AWS algorithm images.


Hey there!

Today's quick tip is about password exposure:

Avoid doing it.


Before you Go

It's also good practice not to have your credentials hardcoded in your script, especially if you intend to commit it to a public repository.

For example, if you need to query a database server, you do not want to have things laid out like this:

AWS Secrets Manager is a managed credential service that lets you store your secrets and assign each one a secret_id to identify it. …
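To make the idea concrete, here is a minimal sketch of reading database credentials from Secrets Manager instead of hardcoding them. It assumes the secret was stored as a JSON string with username and password keys, and the secret name my-db-credentials is hypothetical. The client is passed in as an argument, so the function can be tried with a stand-in object before pointing it at a real boto3 client.

```python
import json


def load_db_credentials(client, secret_id):
    """Fetch a secret and return its username/password pair.

    Assumes the secret value is a JSON string with "username" and
    "password" keys (an assumption of this sketch, not a requirement
    of Secrets Manager).
    """
    response = client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]


class FakeSecretsClient:
    """Stand-in for a boto3 Secrets Manager client, for local experiments."""

    def __init__(self, secrets):
        self._secrets = secrets

    def get_secret_value(self, SecretId):
        # Mimics the shape of the real response: a dict with "SecretString".
        return {"SecretString": json.dumps(self._secrets[SecretId])}


# With real AWS access you would pass boto3.client("secretsmanager") instead.
fake_client = FakeSecretsClient(
    {"my-db-credentials": {"username": "app_user", "password": "not-a-real-password"}}
)
user, password = load_db_credentials(fake_client, "my-db-credentials")
```

The point of the design is that the script only ever contains the secret's identifier, never the password itself, so committing it to a public repository exposes nothing sensitive.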

Dear reader, I hope you're well.

I'm great. Thanks for asking. If you're wondering — yes! This is a "not that useful solution, but perfect use-case to learn cloud features" post.

My idea is to walk you through a "solution" I've come up with to help my girlfriend with her periodic sanitary surveillance visits. These inspections are very pragmatic: they require going over a specific list of items related to the restaurant's working practices and checking whether everything is being done accordingly.

I know what you're thinking right now:

Dude... that's way too specific. What are sanity survival visits?

Hey there,

Setting aside the bad jokes from this first picture, the decision-making process has gone through a transformation this decade, with the increase in data volume as its main driver.

Whatever methods an analysis uses, however complex, they all have data as a common ingredient. In this post I will introduce a simple approach to web-scraping tasks: BeautifulSoup.

Steps in this reading

  1. Find a house rental listings website that allows direct requests.
  2. Scrape and pre-process the data from these listings in order to find common attributes among rentals.
  3. Train a regression…

Hey there,

AWS Lambda is currently one of the most used resources for so-called Serverless Applications. A tool this versatile shouldn't be dismissed with an excuse like "it does not contain libs X or Y in its environment."

In this post, my goal is to share a handy tip that comes along with Lambda Functions: layers you build yourself.

A Layer allows you to deploy .zip files containing any sort of dependency needed by the function, such as code libraries or custom runtimes. In what follows, I will demonstrate how to:

  1. Install Python libraries on a specific…
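The packaging side of a layer can be sketched with the standard library alone. This assumes the libraries were already installed into a local folder (for example with pip install --target), and it relies on the Lambda convention that Python layer contents sit under a top-level python/ directory inside the .zip. The function name build_layer_zip, the mylib package, and the paths are all hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def build_layer_zip(packages_dir, zip_path):
    """Zip every file under packages_dir beneath a top-level "python/" folder."""
    packages_dir = Path(packages_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(packages_dir.rglob("*")):
            if file_path.is_file():
                # Lambda looks for Python layer code under python/ in the archive.
                arcname = Path("python") / file_path.relative_to(packages_dir)
                archive.write(file_path, arcname.as_posix())
    return zip_path


# Demo with a throwaway folder standing in for the installed libraries.
with tempfile.TemporaryDirectory() as tmp:
    packages = Path(tmp) / "packages"
    (packages / "mylib").mkdir(parents=True)
    (packages / "mylib" / "__init__.py").write_text("x = 1\n")

    layer_zip = build_layer_zip(packages, Path(tmp) / "layer.zip")
    with zipfile.ZipFile(layer_zip) as archive:
        names = archive.namelist()
```

Keep the output .zip outside the folder being zipped, otherwise the archive would try to include itself. The resulting file can then be uploaded as a layer through the console or with the aws lambda publish-layer-version command.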

Hello there!

If you've read any of my posts so far, you probably know I'm developing a special series called Practical Implementation.

In this series, the amount of detail I give to the pre-processing stage of each algorithm I put into practice is quite obvious. That's not because the theory behind these models isn't important. It is. A lot.

But at the end of the day, knowing when to use algorithm X or Y is enough to put your model in production. …


Please allow me to start this post by pretending I have more than a handful of recurring readers (Hey mom!) and apologize for this long period without releasing new posts. Things have been very hectic at work, but I'm planning to return to a solid posting frequency now.

As a data-driven decision-maker, I couldn't ignore the fact that my post on Ibovespa Stock Price's Scraping had the highest return on my page so far. So today’s topic follows the same purpose: combining programming languages and the financial market.

If you haven't caught the stock price's yet…

Hello there!

Today I’ll add a quick complement to K-means Algorithm Practical Implementation with Python. If you haven't had the chance to read it yet, you can find it below:

If you recall, we ended up clustering our data, and at some point we had our DataFrame modeled like this:

Four clusters were found!

In the last post, I didn't talk much about plotting, although this might be the coolest part of cluster creation.

In this post I just want to share a quick tip on that. I'll use the plotly.graph_objects library to create this 3D plot.

Warning: we…

Hello there!

Today I'll add a quick complement to Tree Based Decision Model for Classification — Practical Implementation. If you haven't had the chance to read it yet, you can find it below:

Last time we used Decision Tree Classifiers to predict an animal's Class Type based on multiple numeric features.

We know some logic was used to determine the class each animal belongs to, but what was it? Where did the algorithm perform each split?

Today I want to show you a quick tip on how to visualize these Tree Nodes, and consequently, the path of decisions.

The libraries

Hello there!

It's a known fact that Machine Learning algorithms tend to reach higher accuracy for a given class/target the more information about it you feed them.

A balanced target doesn't always come that easily, especially when it comes to rare events. That's when balancing techniques like SMOTE come in handy.

This technique lets you generate artificial data that mirrors the statistical relationships in the original samples while balancing the dataset.

To show how the practical application works, I'll use a dataset that represents one of the most famous cases of rare events. …
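To give a feel for what happens under the hood, here is a simplified, pure-Python sketch of SMOTE's core idea: each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is an illustration only, not the implementation from the imbalanced-learn library; the function name smote_sketch, the toy points, and the parameter defaults are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two numeric features per sample.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, n_new=6)
```

Because every new point is an interpolation between two real minority samples, the synthetic data stays inside the region the minority class already occupies instead of being random noise.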

Rodrigo Dutcosky

Fraud Analytics Coordinator at EBANX
