Imagine one of the most common problems in statistics: we have two distinct data samples and we want to know how likely it is that they come from the same population.
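One common way to frame this question (not necessarily the approach this post ends up taking) is a two-sample test. Here is a minimal sketch using SciPy's Kolmogorov-Smirnov test on synthetic data:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=500)  # sample from one distribution
b = rng.normal(loc=0.2, scale=1.0, size=500)  # sample from a slightly shifted one

# Two-sample Kolmogorov-Smirnov test: a small p-value is evidence
# that the two samples come from different distributions.
res = ks_2samp(a, b)
print(f"KS statistic = {res.statistic:.3f}, p-value = {res.pvalue:.4f}")
```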
How likely are language models to hallucinate? In this blog post, I am setting up a Czech-language eval based on SimpleQA to investigate this. You can find it on my GitHub.
In this post I want to demonstrate that the distinction between supervised and unsupervised learning is somewhat arbitrary. Specifically, I want to solve a supervised learning problem (binary classification) using an unsupervised learning algorithm (kernel density estimation).
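As a rough sketch of what that can look like (my own minimal version, assuming scikit-learn's `KernelDensity`; the post may do this differently): fit one density estimate per class, then assign new points to the class with the higher prior-weighted density.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Toy data: two 1-D blobs, one per class (illustrative only).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-1.0, scale=1.0, size=(200, 1))
X1 = rng.normal(loc=1.5, scale=0.8, size=(200, 1))

# Fit one KDE per class -- this is the "unsupervised" part.
kde0 = KernelDensity(bandwidth=0.5).fit(X0)
kde1 = KernelDensity(bandwidth=0.5).fit(X1)

# Class priors estimated from the training counts.
log_prior0 = np.log(len(X0) / (len(X0) + len(X1)))
log_prior1 = np.log(len(X1) / (len(X0) + len(X1)))

def predict(X):
    """Assign each point to the class with the higher prior-weighted log-density."""
    score0 = kde0.score_samples(X) + log_prior0
    score1 = kde1.score_samples(X) + log_prior1
    return (score1 > score0).astype(int)

print(predict(np.array([[-2.0], [0.0], [2.0]])))
```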
Collaborative filtering is a popular method for solving recommendation tasks. Can a language model do it? Let’s find out.
The goal of this post is to build a model for the 2024 United States presidential election. Is it a good idea? I’m not sure, but that’s not going to stop us.
In the previous post we showed that cross-entropy (spoiler alert if you haven’t read it) is convex. However, training a model using cross-entropy loss doesn’t have to be a convex optimization problem.
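One way to see the distinction (my paraphrase; the post may frame it differently): convexity survives composition with an affine map, but not, in general, with a nonlinear one. The mapping

$$
\theta \;\mapsto\; \ell\bigl(f_\theta(x),\, y\bigr)
$$

is convex when $\ell$ is convex in the model output and $f_\theta(x)$ is affine in $\theta$, but generally not when $f_\theta$ is a neural network with nonlinear hidden layers.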
The other day I saw a list of interesting data science problems with one of them being “show that cross-entropy loss is convex”. Let’s look into it!
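For reference, the binary version of the loss in question (the standard definition, which I assume is what the list means), as a function of the predicted probability $p \in (0, 1)$ for label $y \in \{0, 1\}$:

$$
\ell(p, y) = -\,y \log p - (1 - y) \log(1 - p),
\qquad
\frac{\partial^2 \ell}{\partial p^2} = \frac{y}{p^2} + \frac{1 - y}{(1 - p)^2} \;\ge\; 0 .
$$

The nonnegative second derivative already gives convexity in $p$.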
Let’s start with a simple example. There is a list `x = []` to which we add new elements using `x.append`. After each append, we call `sys.getsizeof(x)` to check how many bytes it takes to store `x` in memory.
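A minimal version of that experiment (my sketch; the post’s actual code may differ):

```python
import sys

x = []
for i in range(20):
    x.append(i)
    # sys.getsizeof reports the size of the list object itself,
    # not of the elements it references.
    print(f"len = {len(x):2d}, size = {sys.getsizeof(x)} bytes")
```

The reported size grows in occasional jumps rather than after every append, because CPython over-allocates list storage.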
I store my resume as a source `.tex` file in a GitHub repository. When I need to send it to someone, I either have to find the most recent version in my emails or go through the process of rebuilding it.
When working with Jupyter notebooks, I often struggle with the combinatorial explosion that inevitably happens as you explore your problem space.
Hey! This is the start of a new blog. I am going to write mostly about data science, scientific computing, and software engineering. It is my first attempt at blogging, so let’s see how it goes!