Imagine one of the most common problems in statistics: we have two distinct data samples and we want to know how likely it is that they come from the same population.
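One common way to frame this question (not necessarily the approach this post ends up taking) is a two-sample test. Here is a minimal sketch using SciPy's Kolmogorov-Smirnov test on synthetic data:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=500)  # sample from one distribution
b = rng.normal(loc=0.2, scale=1.0, size=500)  # sample from a slightly shifted one

# Two-sample Kolmogorov-Smirnov test: a small p-value is evidence
# that the two samples come from different distributions.
res = ks_2samp(a, b)
print(f"KS statistic = {res.statistic:.3f}, p-value = {res.pvalue:.4f}")
```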
How likely are language models to hallucinate? In this blog post, I am setting up a Czech-language eval based on SimpleQA to investigate this. You can find it on my GitHub.
In this post I want to demonstrate that the distinction between supervised and unsupervised learning is somewhat arbitrary. Specifically, I want to solve a supervised learning problem (binary classification) using an unsupervised learning algorithm (kernel density estimation).
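As a rough sketch of what that can look like (my own minimal version, assuming scikit-learn's `KernelDensity`; the post may do this differently): fit one density estimate per class, then assign new points to the class with the higher prior-weighted density.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Toy data: two 1-D blobs, one per class (illustrative only).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-1.0, scale=1.0, size=(200, 1))
X1 = rng.normal(loc=1.5, scale=0.8, size=(200, 1))

# Fit one KDE per class -- this is the "unsupervised" part.
kde0 = KernelDensity(bandwidth=0.5).fit(X0)
kde1 = KernelDensity(bandwidth=0.5).fit(X1)

# Class priors estimated from the training counts.
log_prior0 = np.log(len(X0) / (len(X0) + len(X1)))
log_prior1 = np.log(len(X1) / (len(X0) + len(X1)))

def predict(X):
    """Assign each point to the class with the higher prior-weighted log-density."""
    score0 = kde0.score_samples(X) + log_prior0
    score1 = kde1.score_samples(X) + log_prior1
    return (score1 > score0).astype(int)

print(predict(np.array([[-2.0], [0.0], [2.0]])))
```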
Collaborative filtering is a popular method for solving recommendation tasks. Can a language model do it? Let’s find out.
The goal of this post is to build a model for the 2024 United States presidential election. Is it a good idea? I’m not sure, but that’s not going to stop us.
In the previous post we showed that cross-entropy (spoiler alert if you haven’t read it) is convex. However, training a model using cross-entropy loss doesn’t have to be a convex optimization problem.
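One way to see the distinction (my paraphrase; the post may frame it differently): convexity survives composition with an affine map, but not, in general, with a nonlinear one. The mapping

$$
\theta \;\mapsto\; \ell\bigl(f_\theta(x),\, y\bigr)
$$

is convex when $\ell$ is convex in the model output and $f_\theta(x)$ is affine in $\theta$, but generally not when $f_\theta$ is a neural network with nonlinear hidden layers.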
The other day I saw a list of interesting data science problems with one of them being “show that cross-entropy loss is convex”. Let’s look into it!
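For reference, the binary version of the loss in question (the standard definition, which I assume is what the list means), as a function of the predicted probability $p \in (0, 1)$ for label $y \in \{0, 1\}$:

$$
\ell(p, y) = -\,y \log p - (1 - y) \log(1 - p),
\qquad
\frac{\partial^2 \ell}{\partial p^2} = \frac{y}{p^2} + \frac{1 - y}{(1 - p)^2} \;\ge\; 0 .
$$

The nonnegative second derivative already gives convexity in $p$.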
Let’s start with a simple example. There is a list `x = []` to which we add new elements using `x.append`. After each append, we call `sys.getsizeof(x)` to check how many bytes it takes to store `x` in memory.
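A minimal version of that experiment (my sketch; the post’s actual code may differ):

```python
import sys

x = []
for i in range(20):
    x.append(i)
    # sys.getsizeof reports the size of the list object itself,
    # not of the elements it references.
    print(f"len = {len(x):2d}, size = {sys.getsizeof(x)} bytes")
```

The reported size grows in occasional jumps rather than after every append, because CPython over-allocates list storage.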
I store my resume as a source `.tex` file in a GitHub repository. When I need to send it to someone, I either have to find the most recent version in my emails or go through the process of rebuilding it.
When working with Jupyter notebooks, I often struggle with the combinatorial explosion that inevitably happens as you explore your problem space.
Hey! This is the start of a new blog. I am going to write mostly about data science, scientific computing, and software engineering. It is my first attempt at blogging, so let’s see how it goes!