2026-02-04
This project extracts and validates human-like conversational parameters by analyzing Ubuntu IRC channel logs — then uses those parameters to make LLM agents converse more naturally.
The project is split into two parts:
Project 1 (Preprocessing) handles IRC data parsing, thread disentanglement, and parameter extraction. Project 2 (LLM Test) runs comparison experiments with metrics toggled on and off to measure how much these parameters actually matter.
IRC channels are chaotic — dozens of conversations happening simultaneously in a single stream. The first challenge was separating this mess into individual threads.
I built a semantic time-decay algorithm that weighs three signals:
| Signal | Behavior |
|---|---|
| Participant continuity | Maintained regardless of time gap |
| Semantic similarity | Decays exponentially over time — exp(-Δt/τ) |
| Mentions (@user) | Always merged |
With a time scale (τ) of 120 seconds and a merge threshold of 1.0, this turned 8,951 raw utterances into 1,318 coherent threads.
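The merge rule can be sketched as a scoring function over the three signals. This is a minimal illustration, not the project's actual code: the message/thread dictionaries and the `similarity` input (assumed to be a precomputed similarity in [0, 1]) are hypothetical.

```python
import math

TAU = 120.0           # time scale in seconds (from the post)
MERGE_THRESHOLD = 1.0  # a message joins the best thread if score >= threshold

def thread_score(msg, thread, similarity):
    """Hypothetical scoring combining the three signals above."""
    # Mentions (@user) always merge: treat as an unconditional match.
    if thread["last_author"] in msg["mentions"]:
        return float("inf")
    score = 0.0
    # Participant continuity: fixed contribution, no time decay.
    if msg["author"] in thread["participants"]:
        score += 1.0
    # Semantic similarity decays exponentially with the time gap.
    dt = msg["time"] - thread["last_time"]
    score += similarity * math.exp(-dt / TAU)
    return score
```

A message is compared against all recent threads; if no thread reaches the threshold, it starts a new one.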
Real humans don't respond instantly. They pause, think, type. I extracted timing distributions from the IRC data:
| Response Type | Instant Reply Rate | Delay Otherwise (seconds) |
|---|---|---|
| Quick | 71% | 3–10 |
| Normal | 69% | 3–10 |
| Detailed | 62% | 10 + words × 1.0 + tech terms × 20 |
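One way to turn those distributions into agent behavior is a small delay sampler. A sketch under assumptions: "instant" is taken to mean near-zero delay, and the function name and signature are illustrative, not from the project.

```python
import random

INSTANT_RATE = {"quick": 0.71, "normal": 0.69, "detailed": 0.62}

def response_delay(kind, words=0, tech_terms=0):
    """Hypothetical sampler: seconds to wait before sending a reply."""
    if random.random() < INSTANT_RATE[kind]:
        return 0.0  # instant reply (assumed near-zero delay)
    if kind == "detailed":
        # 10 s base + 1 s per word + 20 s per technical term
        return 10 + words * 1.0 + tech_terms * 20
    # Quick and normal replies: uniform draw from the 3-10 s range
    return random.uniform(3, 10)
```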
Humans don't dump everything into one message. They chunk:
67% of messages are single-chunk, 21% are split into two, and 12% into three or more. Each chunk maxes out at about 15 words (the 75th percentile of human message length).
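The chunking stats above can be applied at generation time by sampling a chunk count and capping chunk size. A minimal sketch, assuming simple word-boundary splitting; the helper name and structure are illustrative.

```python
import random

CHUNK_DIST = [(1, 0.67), (2, 0.21), (3, 0.12)]  # chunk-count probabilities
MAX_CHUNK_WORDS = 15  # ~75th percentile of human message length

def chunk_message(text):
    """Hypothetical chunker: split a reply into human-like message chunks."""
    words = text.split()
    if not words:
        return [text]
    n_chunks = random.choices(
        [n for n, _ in CHUNK_DIST], weights=[w for _, w in CHUNK_DIST]
    )[0]
    # Enforce the ~15-word cap even if a small chunk count was sampled.
    n_chunks = max(n_chunks, -(-len(words) // MAX_CHUNK_WORDS))
    size = -(-len(words) // n_chunks)  # ceiling division
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
```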
These parameters — timing delays, chunking ratios, thread awareness — are what separate an LLM that sounds like a chatbot from one that feels like it belongs in the conversation.
| Item | Value |
|---|---|
| Source | Ubuntu IRC #ubuntu (Libera.chat) |
| Period | January 2024 (1 month) |
| Utterances | 8,951 |
| Threads | 1,318 |