Most property insurers today still rely on a guy with a ladder and a camera on a stick to perform physical inspections and assess risk. But smart insurers are enlisting the help of AI researchers who have developed platforms that can evaluate thousands of publicly available images and other data points on the web to deliver a risk assessment within seconds.
“We make sure that the insurer can access that data very, very quickly, especially if it’s being used in a quote engine,” said Ryan Kottenstette, CEO at Cape Analytics, a deep learning company that provides predictive risk analysis…
I recently wrote two introductory articles about processing Big Data with Dask and Vaex, two libraries for processing bigger-than-memory datasets. While writing, a question popped up in my mind:
Can these libraries really process bigger-than-memory datasets, or is it all just a sales slogan?
This prompted me to run a practical experiment with Dask and Vaex and try to process a bigger-than-memory dataset. The dataset was so big that you cannot even open it with pandas.
I was recently contacted by a recruiter from a Big Tech company. Why now and never before?
In this article, I present my theory of why a recruiter contacted me for a Senior Data Science position. You can use my theory (and develop it further) to increase your chances of getting contacted by a Big Tech company.
Many Software Developers dream about working for a Big Tech company. How do I know? I was one of them.
Pandas needs no introduction, as it has become the de facto tool for Data Analysis in Python. As a Data Scientist, I use pandas daily and it never ceases to amaze me with better ways of achieving my goals.
For pandas newbies: pandas provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
The name pandas is derived from the term “panel data”, an econometrics term for datasets that include observations over multiple time periods for the same individuals.
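To make the term "panel data" concrete, here is a minimal sketch of what such a dataset can look like in pandas. The column names (`person`, `year`, `income`) and the values are made up for illustration; they are not taken from the article.

```python
import pandas as pd

# Hypothetical toy data: two individuals observed over three years.
records = [
    {"person": "A", "year": 2018, "income": 40},
    {"person": "A", "year": 2019, "income": 42},
    {"person": "A", "year": 2020, "income": 45},
    {"person": "B", "year": 2018, "income": 55},
    {"person": "B", "year": 2019, "income": 53},
    {"person": "B", "year": 2020, "income": 58},
]

# Index by (individual, time period): the typical shape of panel data.
panel = pd.DataFrame(records).set_index(["person", "year"]).sort_index()

# Year-over-year change, computed separately for each individual.
# The first observation of each person has no previous period, so it is NaN.
panel["income_change"] = panel.groupby(level="person")["income"].diff()

print(panel)
```

Indexing by both the individual and the time period is one common way to hold such data; a plain "long" DataFrame with `person` and `year` as ordinary columns works just as well.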
In this article, I’m going to show you 5 pandas tricks that will make you more productive…
Start the New Year with one of the best New Year’s resolutions: Learn more Python.
You can start with this article in which I present 5 Python tricks that will make your life easier.
You’ll learn:
Have you ever planned an hour for a short task, only to spend a whole day working on it? If yes, welcome to my world!
In this article, I present 3 pandas mistakes that took me much longer to solve than they should have. I also share the link to the Notebook with examples at the end of this article.
Only a fool learns from his own mistakes. The wise man learns from the mistakes of others.
See my pandas articles to learn more about Data Analysis with pandas:
One of the most common misconceptions in Machine Learning is that ML Engineers get a CSV dataset and they spend the majority of the time optimizing the hyperparameters of a model.
If you work in the industry, you know that’s far from the truth. ML Engineers spend most of their time planning how to construct a training set that resembles the real-world data distribution for a certain problem.
Once you’ve managed to construct such a training set, just add a few well-crafted features and the Machine Learning model won’t have a hard time finding the decision boundary.
In this article, we’re going…
scikit-learn is my first choice when it comes to classic Machine Learning algorithms in Python. It has many algorithms, supports sparse datasets, is fast and has many utility functions, like cross-validation, grid search, etc.
When it comes to advanced modeling, scikit-learn often falls short. If you need Boosting, Neural Networks or t-SNE, it’s better to avoid scikit-learn.
scikit-learn has two basic implementations for Neural Nets. There’s MLPClassifier for classification and MLPRegressor for regression.
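As a rough sketch of what using one of these estimators looks like, the snippet below trains an MLPClassifier on a synthetic dataset. The dataset, the layer sizes and the other settings are arbitrary choices for illustration, not values from the article.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# hidden_layer_sizes sets the number of hidden layers and their widths:
# here, two hidden layers with 16 and 8 units.
clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

# Mean accuracy on the held-out split.
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

MLPRegressor follows the same fit/predict pattern for regression targets.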
While MLPClassifier and MLPRegressor have a rich set of arguments, there’s no option to customize layers of a Neural Network (beyond setting the number of hidden…
Every day there’s more and more educational content about Machine Learning. With such a high volume of new content, it’s easy to get confused. Many aspiring Data Scientists don’t know where or how to start learning.
These three questions pop up regularly in my inbox:
In this article, I give answers to the questions above and I also present a better way on…
I get many messages asking for advice from aspiring Data Scientists. I am no expert in career advising so take everything that I write with a grain of salt.
I give advice based on my observations of the field and the experience that I’ve gained over the years. This is me advising my younger self, as I had similar questions at the start of my career.
My advice would be to start with practical projects and then slowly progress with theory. Kaggle notebooks are a great way to learn the practical part.
Ask questions in Reddit communities or in Cross Validated…