The best vector databases for storing embeddings


Delve into the World of Vector Databases Fueling NLP's Transformative Journey.

Intrinsic vs. Extrinsic Evaluation - What's the Best Way to Measure Embedding Quality?


Learn how to measure the quality of word and sentence embeddings in natural language processing (NLP), including intrinsic and extrinsic evaluation, and their strengths and limitations.

Rethinking the Link Between Speech and Expertise


We often associate eloquent speech with intelligence and knowledge. But what if I told you that this assumption is not always true?

Understanding AI with ELI5 - Demystifying Decisions (tutorial)


Want to know why your AI model made that decision? ELI5 has got you covered. Let's dive into Explainable AI with ELI5.

Comprehensive guide to interpreting R², MSE, and RMSE for regression models.


Don't let misleading metrics fool you. Master the art of analyzing regression model performance and make smarter decisions.

Libraries for Automated Exploratory Data Analysis (EDA)


EDA Made Easy - Discover the Top 10 Python Libraries That Will Take Your Data Analysis to the Next Level! Learn the Secrets of Automated EDA!

Beat Overfitting in Kaggle Competitions: Proven Techniques


Ready to take your Kaggle competition game to the next level? Learn how to recognize and prevent overfitting for top-notch results.

New Cognitive Skills in the Age of AI Tailored Information Presentation


Exploring the new cognitive skills of tomorrow with advanced AI generative models.

The Impact of Search Engines and AI Generative Models on Mental and Cognitive Capabilities


Understand the effects of search engines and AI on our mental and cognitive capabilities. Equip yourself with the knowledge you need to make informed decisions about your own usage of these technologies.

Leveraging Language Models in Corporate Environments: The Future of Knowledge Management


Explore the benefits and challenges of using Large Language Models (LLMs) in corporate environments for improved knowledge management. Learn how to implement LLMs and overcome potential obstacles.

Becoming a Data Wizard: The Benefits of Learning Databricks


Learn how Databricks can help you master big data, improve data processing and machine learning skills and excel in your career. Boost your career with this powerful platform.

Common Types of Data Science Projects


Learn about common types of data science projects and best practices for approaching them. From end-to-end individual work to production-ready projects, this guide covers it all.

How to Detect ChatGPT-Generated Text?


Discover the latest methods for distinguishing machine-generated text from human-written text. Learn about statistical, syntactic, semantic, and neural network-based approaches. Stay up-to-date with the latest research in NLP and AI.

Visual Text Exploration as Part of Preprocessing Before Classification


This post discusses the importance of visual text exploration in preprocessing for classification, covers techniques (word clouds, sentiment analysis, topic modeling, data cleaning), and shows how to use them with popular libraries. Readers are encouraged to try them on their own projects.

Discovering Hidden Gems - Popular and Lesser-Known Dataset Sharing Platforms


Looking for the key to unlocking valuable datasets? Dive into the world of Kaggle, UCI, and more as we unveil the best platforms for data enthusiasts. 🗝️

Top 10 Python Libraries for Document Classification


Unlock the power of document classification with these top Python libraries! Discover the best tools for effortless text analysis and more.

Pro Tips for Diagnosing Regression Model Errors


Improve your regression model's accuracy and predictability by uncovering hidden errors with these essential plots.

15 tools for document Deskewing and Dewarping


Input for document processing tasks such as OCR, table detection, or text segmentation is sometimes a scan or a handheld photo with a less-than-ideal perspective: the page is rotated or spatially distorted in some way (a warped document). If you are looking for my recommendations, go straight to the last section of this article, "Summary and recommendations".

Understanding Micro and Macro Averages in Multiclass Multilabel Problems


Learn about micro and macro averages in multiclass multilabel problems, the difference between multiclass and multilabel problems and when to use micro and macro averages.

Unleashing the Power of T-SNE for Dimensionality Reduction in Python


Want to create beautiful visualizations from complex data? Discover the power of T-SNE for dimensionality reduction in Python.

Kurtosis in simple terms, interpretation and gotchas


Statistics can be tricky, but understanding kurtosis is a must for anyone who wants to avoid making common mistakes in statistical analyses. Learn how to interpret it in this comprehensive guide.

Finding Errors in Data - Data Validation


Explore methods to detect & fix errors in data, including validation, visualizations, statistical tests, cleaning techniques, machine learning & data quality tools. Get concise, easy-to-understand information with examples & links to external resources.

Pandas Schema Validation


Overview of the available tools and methods for schema validation in pandas, with example code snippets and recommendations for when to use each tool.

Metrics Used to Compare Histograms


Learn about metrics used to compare histograms, with examples of how to calculate them in Python. From Chi-Squared distance to Kullback-Leibler divergence and Earth Mover's distance. A comprehensive guide.

Kaggle evaluation metrics used for regression problems


This post describes evaluation metrics used in Kaggle competitions for regression problems. Eight metrics are covered: Absolute Error (AE), Mean Absolute Error (MAE), Weighted Mean Absolute Error (WMAE), Pearson Correlation Coefficient, Spearman’s Rank Correlation, Root Mean Squared Error (RMSE), Root Mean Squared Logarithmic Error (RMSLE), and Mean Columnwise Root Mean Squared Error (MCRMSE).

What's cooking


Exploratory Data Analysis of the Kaggle "What's cooking" competition dataset, to understand what kind of data we are dealing with and build intuition about the dependencies in it.
