Data Science Command-Line Tools


A description of GNU utilities and other, less standard tools that help with processing data from the command line or in shell scripts.


Finding the spy - post on Markov Chains and stochastic matrices


Using a puzzle about tracing a high-profile spy as an excuse to showcase Markov chains and demonstrate their usage and properties, e.g. the stationary distribution.
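
To make the idea of a stationary distribution concrete, here is a minimal sketch that is not taken from the post itself. It assumes a hypothetical three-location chain with a row-stochastic transition matrix `P` and finds the distribution that no longer changes when multiplied by `P`, using simple repeated multiplication (power iteration). The matrix values and the function names are illustrative assumptions.

```python
def step(dist, P):
    """One step of the chain: row vector `dist` times transition matrix `P`."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, tol=1e-12, max_iter=10_000):
    """Iterate dist <- dist * P from a uniform start until it stops changing."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical chain (an assumption for illustration): each row sums to 1,
# and P[i][j] is the probability of moving from location i to location j.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

pi = stationary(P)
print([round(p, 4) for p in pi])
```

Power iteration is only one way to get the stationary distribution; it converges for chains like this one that are irreducible and aperiodic, and solving the linear system directly would work as well.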


Top CLI tools that improve work in the console in 2019


On the GitHub project Awesome Zsh plugins you can find 1700+ links to plugins, themes and Zsh plugin managers/frameworks. With so many tools listed on that page, it is difficult to tell which plugins have already earned a good reputation in the Zsh user community. This post aims to identify the most popular tools, where popularity is measured by the number of stars that GitHub users have given to a plugin or tool.


Learn Bayesian methods in 4 steps - by reading and by doing


This post proposes a 4-step path for learning Bayesian methods. The first step is going through the book “Bayesian methods for hackers”; the second, using complementary books on probability and statistics; the third, reading “How to become a Bayesian in eight easy steps: An annotated reading list”; and the last, working through a book full of exercises: “Think Bayes”.


Kaggle evaluation metrics used for regression problems


This post describes the evaluation metrics used in Kaggle competitions where the problem to solve is a regression problem. Eight different metrics are described, namely: Absolute Error (AE), Mean Absolute Error (MAE), Weighted Mean Absolute Error (WMAE), Pearson Correlation Coefficient, Spearman’s Rank Correlation, Root Mean Squared Error (RMSE), Root Mean Squared Logarithmic Error (RMSLE), Mean Columnwise Root Mean Squared Error (MCRMSE).
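
As a rough illustration of how several of these metrics are computed, the sketch below implements MAE, WMAE, RMSE, RMSLE and MCRMSE in plain Python using their commonly cited definitions. The function names and sample values are assumptions made for this example, and individual competitions may define details (such as how weights are normalized) differently.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def wmae(y_true, y_pred, weights):
    """Weighted MAE: absolute errors weighted, then divided by the weight sum."""
    total = sum(w * abs(t - p) for t, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error, using log(1 + x) on both sides."""
    squared = ((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared) / len(y_true))


def mcrmse(columns_true, columns_pred):
    """Mean Columnwise RMSE: RMSE per target column, averaged over columns."""
    scores = [rmse(t, p) for t, p in zip(columns_true, columns_pred)]
    return sum(scores) / len(scores)


# Made-up sample values, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
weights = [1.0, 1.0, 2.0, 1.0]

print(mae(y_true, y_pred), wmae(y_true, y_pred, weights), rmse(y_true, y_pred))
```

RMSLE penalizes relative rather than absolute differences, which is why it tends to be used when targets span several orders of magnitude; it requires values greater than -1.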


How to install TensorFlow and Keras on Windows 10


A guide to installing the CPU-only version of TensorFlow, for machines without a CUDA-capable GPU. It is a step-by-step procedure, from creating a conda environment to testing whether TensorFlow and Keras work.
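
The final "does it work" step could look something like the generic check below. It is not the test procedure from the post; it only assumes that the packages are importable under the names `tensorflow` and `keras`, and the helper `installed_version` is a hypothetical name introduced here.

```python
import importlib


def installed_version(package_name):
    """Return the package's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    # Some packages do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


# Run inside the activated conda environment.
for name in ("tensorflow", "keras"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```

A successful import only shows that the package is present in the environment; running a small computation afterwards is a stronger check.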


Darwin Approach to Traveling Salesman


Can an evolutionary approach crack a problem that brute force would take far longer than the age of the universe to solve? This post shows how to attack the Traveling Salesman Problem using a Darwinian approach. I describe the evolution model and the design decisions. See the animation of how the population evolved through the epochs.
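
For readers who want a feel for the technique before opening the post, here is a minimal sketch of one possible evolutionary loop for the Traveling Salesman Problem. It is not the model described in the post: the selection scheme (keep the better half), the ordered crossover, the swap mutation, and every parameter value are assumptions chosen for brevity.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def ordered_crossover(parent_a, parent_b, rng):
    """Copy a slice from parent_a, fill the rest in parent_b's order."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = parent_a[i:j + 1]
    taken = set(child[i:j + 1])
    fill = iter(c for c in parent_b if c not in taken)
    return [c if c is not None else next(fill) for c in child]


def mutate(tour, rng, rate=0.2):
    """With probability `rate`, swap two cities in a copy of the tour."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def evolve(cities, pop_size=60, epochs=200, seed=0):
    """Return the best tour found after `epochs` generations."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(epochs):
        # Selection: the shorter half of the tours survives unchanged.
        population.sort(key=lambda t: tour_length(t, cities))
        survivors = population[: pop_size // 2]
        # Reproduction: each child comes from two distinct survivors.
        children = [
            mutate(ordered_crossover(*rng.sample(survivors, 2), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=lambda t: tour_length(t, cities))


# Hypothetical input: 15 random points in the unit square.
point_rng = random.Random(42)
cities = [(point_rng.random(), point_rng.random()) for _ in range(15)]
best = evolve(cities)
print(best, round(tour_length(best, cities), 3))
```

Because the surviving half is carried over unchanged, the best tour never gets worse from one epoch to the next; the result is a good tour, not a guaranteed optimum.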


How to organize a Data Science project based on Jupyter notebooks


Having several notebook-based projects behind you can leave your projects directory in a mess. Organize your Data Science project based on Jupyter notebooks so that one can navigate through it, especially since “the one” will most probably be you in a few months’ time. To achieve that: keep your projects directory clean, name the project descriptively, and take care of the project’s internal structure.


What’s cooking


Exploratory Data Analysis of Kaggle’s “What’s cooking” competition dataset, to understand what kind of data we are dealing with and to build intuition about existing dependencies.


Blockchain implementation


A Python implementation of a blockchain in a few lines of code.
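
To give a sense of what "a few lines" can cover, here is a minimal sketch of a hash-linked chain of blocks. It is not the implementation from the post: the block fields, the helper names, and the sample payloads are assumptions, and it deliberately leaves out proof-of-work, networking and consensus.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over the block's content fields (everything except its own hash)."""
    content = {key: block[key] for key in ("index", "data", "prev_hash")}
    payload = json.dumps(content, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, data):
    """Append a block that stores the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Made-up payloads, for illustration only.
chain = []
for payload in ("genesis", "alice pays bob 5", "bob pays carol 2"):
    add_block(chain, payload)

print(is_valid(chain))
```

The point of the structure is that each block commits to its predecessor, so changing the data in an earlier block makes the validity check fail unless every later block is recomputed.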


Most popular Zsh plugins on GitHub


There is an exhaustive but curated list of Zsh plugins in the GitHub project Awesome Zsh plugins. There you can find 800+ links to plugins, themes and Zsh plugin managers/frameworks. Even though it is a collection of awesome stuff, the number is a bit too high to tell which plugins have already earned a good reputation in the Zsh user community. In this post I will identify the most popular plugins: those with the highest number of stars.
