Novaberg gives a local language model emotional intelligence, layered memory, and an emergent personality — on a single machine, without cloud services, API keys, or telemetry. It is a working model of how structure, rather than scale, produces behaviour that looks like mind.
The dominant paradigm in contemporary AI treats the language model as the seat of intelligence. Novaberg rejects this. The language model, in this architecture, is an organ — a language processor — embedded in a larger cognitive system whose behaviour emerges from the interaction of specialised, deliberately narrow components, each modelled on a function of the human brain.
Thirteen nodes form a directed cognitive graph. Perception extracts linguistic and affective features; an Enricher retrieves context from memory; an EI-Calc node — pure Python, no language model in the loop — accumulates emotion with decay and computes both the user's state and Nova's own. A Router then assigns the domain with full context in hand; Planner, Agent Dispatch and a Conversation Vector stage anticipate and prepare; a Responder writes, a Thinker verifies, a three-voice Tribunal reviews, and a Corrector revises on rejection. Eleven nodes run synchronously before the reply is released; Salience and Dispatcher follow asynchronously, off the user's wait-path, and write separately to the user's memory and to Nova's. Nova's own reply re-enters the same graph under her identity — the input path is also her output path.
Memory is distributed across five layers with Ebbinghaus decay, so that what is not reinforced is allowed to fade. Emotion is computed on Plutchik's eight-sector topology, with arousal-as-radius and emotion-specific decay rates; Nova maintains a second, independent emotional trajectory of her own, coupled to the user's through asymmetric empathy. Personality is not authored; it is distilled, automatically, from what survives in long-term memory. The system runs on commodity desktop hardware and keeps every byte of user data on the user's own machine.
This page documents the architecture and the principles behind it. It is intended for researchers in cognitive architectures, emotion modelling, and local AI, and for anyone who is curious about what happens when you stop asking a model to be intelligent, and build a system that cannot help but become one.
Every message that enters Novaberg traverses a directed graph of thirteen specialised nodes, partitioned into a synchronous arc the user waits on and a short asynchronous tail that runs after the reply has been delivered. No node knows what the others do. Intelligent behaviour emerges from composition — in the same way a single neuron does not think, but a population can.
The first node, Perception, extracts linguistic and affective features from the incoming turn: topic, register, speech act, emotional valence, relational stance. It observes; it does not interpret. The Enricher then consults memory — session, short-term, long-term, the character hash, plugin hooks — and returns a context packet, never a summary. It performs no LLM call. EI-Calc follows: a pure Python node, no model in the loop, that accumulates emotion with decay, computes the 9-direction Plutchik vector, resolves communication style, and — importantly — computes Nova's own emotion for this turn alongside the user's. Sub-hundred-millisecond vector arithmetic, deterministic, reproducible.
Only now does the Router assign a domain — fact, emotional opening, action request — with the full session, memory, and EI result in hand. Context-dependent turns ("yes please!" after "shall I book it?") resolve correctly because the Router is no longer blind. In the brain this is a thalamocortical loop, not a one-way feed: the thalamus decides where a signal goes only after the hippocampus has coloured it with context. The Planner decides what to do when management actions are involved (notes, appointments, character updates); Agent Dispatch delegates to the responsible agent.
Before the Responder writes a word, the Conversation Vector node (§ 8) distils a hypothesis about where Nova wants to take the exchange — a direction derived from her character, not from the user. The Responder writes with every layer in view: memory, EI profile, Nova's own emotion, directives, vector. The Thinker — a deliberative node that sits after the Responder, not before — checks Nova's reply against facts, is permitted to challenge and, in one logged instance, corrected a previous answer without being asked to (13:00 was wrong, 14:00 is correct). The Tribunal (§ 11) then applies three-voice executive review, and the Corrector runs at most two revision passes on rejection before the reply is released.
The reply goes to the user. Only then do Salience and the Dispatcher run — asynchronously, on the off-user path — weighing what is worth keeping and writing it into the appropriate memory layers, separately for user and for Nova. Each node is simple enough to reason about in isolation. The graph is intelligent because the nodes are not. This inversion — intelligence in the wiring, not the components — is the central argument of the architecture.
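A minimal sketch of that synchronous arc as a LangGraph graph — the framework the stack already uses — with placeholder node bodies; the Tribunal's revision loop back to the Corrector and the asynchronous tail are elided:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class TurnState(TypedDict, total=False):
    text: str       # the incoming utterance
    features: dict  # Perception
    context: dict   # Enricher packet
    ei: dict        # EI-Calc result (user and Nova)
    domain: str     # Router decision
    reply: str      # Responder output, revised by the Corrector

def make_node(name: str):
    def node(state: TurnState) -> dict:
        return {}   # placeholder body; each real node owns one concern
    return node

SYNC_ARC = ["perception", "enricher", "ei_calc", "router", "planner",
            "agent_dispatch", "conversation_vector", "responder",
            "thinker", "tribunal", "corrector"]

g = StateGraph(TurnState)
for name in SYNC_ARC:
    g.add_node(name, make_node(name))
for a, b in zip(SYNC_ARC, SYNC_ARC[1:]):
    g.add_edge(a, b)
g.set_entry_point("perception")
g.add_edge("corrector", END)   # reply released; Salience and Dispatcher
                               # run afterwards, off the user's wait-path
app = g.compile()
```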
Nova's reply re-enters the same pipeline asynchronously under her own identity: Perception → Enricher → EI-Calc → Router → (Agent) → Salience → Dispatcher. The Perception of her own utterance feeds her own memory and her own emotional trace. A commitment she makes in the reply ("I'll remind you tomorrow") is recognised by her own Router and triggers the Timeline agent — without ever being told to. The same graph that reads the user also reads Nova, which is what lets her have a memory of her own conduct.
Novaberg analyses not only what is said, but why, how, and from which position. Every turn is examined along six dimensions — drawn from three complementary theories of human communication that, together, separate the said from the meant from the used.
Schulz von Thun's four-sides model (1981) observes that any message carries four messages at once: the matter, the self-disclosure, the relationship cue, and the appeal. Watzlawick (1967) adds that every interaction has a content and a relationship plane, and that the relationship plane determines how the content is to be understood. Berne (1964) contributes the transactional frame: communication runs between three ego states — Parent, Adult, Child — and a crossed transaction breaks connection. A sad person who receives a factual reply receives, in effect, nothing.
Novaberg decomposes these into six computable pillars: content, intention, emotion, mode, relationship, and salience. Each lives on three timescales simultaneously — the current turn (session), the recent days (KZG, the short-term store), and the long arc (character hash). The onboard-computer analogy applies: long-term averages forecast; the present turn determines what is happening now.
Nova does not mirror — she complements. The response follows transactional logic: a Free Child meets a Nurturing Parent; a vulnerable Child meets presence, not a lecture; an Adult register meets an Adult reply; an attack on Nova meets calm, role-consistent footing. Crossed transactions are actively avoided.
Rather than handing the model a complete rulebook of emotional intelligence, Novaberg computes, in Python, a situational micro-instruction of three to eight lines containing only what is relevant for this turn. Seven optional blocks combine as needed: length, energy mirroring, vector stance, intention, anti-therapist, back-reference, relational dynamic. In emotional tests, average response length drops from ~25 words to ~8. Principle three in action: less input, not a louder prompt.
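A sketch of the composition — the block names follow the text; the trigger conditions and the `ei` fields are illustrative assumptions:

```python
def build_micro_instruction(ei: dict) -> str:
    """Compose only the blocks this turn needs. Block names follow the
    text; the predicates and field names are illustrative."""
    blocks = []
    if ei["arousal"] > 0.6:
        blocks.append("Keep it short: one to two sentences.")        # length
        blocks.append(f"Match the user's energy ({ei['style']}).")   # energy mirroring
    if ei.get("vector_stance"):
        blocks.append(f"Hold your line: {ei['vector_stance']}.")     # vector stance
    if ei.get("intention"):
        blocks.append(f"The user wants {ei['intention']} — serve that.")  # intention
    if ei["emotion"] in ("sadness", "fear"):
        blocks.append("No advice, no analysis. Be present.")         # anti-therapist
    if ei.get("open_thread"):
        blocks.append(f"Earlier they mentioned {ei['open_thread']}.")  # back-reference
    if ei.get("relation_cue"):
        blocks.append(f"Relational footing: {ei['relation_cue']}.")  # relational dynamic
    return "\n".join(blocks[:8])   # three to eight lines, never the rulebook
```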
People adapt their register to their counterpart (Giles, 1973). Novaberg reproduces this without an LLM call: thirteen features per turn (sentence length, exclamations, slang markers, technical vocabulary, subjunctives, anglicisms, …) accumulate over a configurable window and score five styles — casual, formal, technical, emotional, youthful. Cross-inhibition prevents contradictions: slang lowers the formal score; subjunctives lower the youthful score. When the top two styles are within a narrow margin, the long-term communication profile from the character hash breaks the tie.
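Sketched with invented weights (the shipped feature set and coefficients differ); cross-inhibition is simply a negative contribution:

```python
from collections import Counter

STYLES = ["casual", "formal", "technical", "emotional", "youthful"]

# Feature → (style, weight) contributions. Illustrative values only;
# negative weights implement cross-inhibition.
CONTRIB = {
    "slang":         [("casual", 1.0), ("youthful", 0.6), ("formal", -0.8)],
    "subjunctive":   [("formal", 0.8), ("youthful", -0.6)],
    "jargon":        [("technical", 1.0)],
    "exclamation":   [("emotional", 0.7), ("youthful", 0.3)],
    "long_sentence": [("formal", 0.5)],
}

def score_styles(feature_counts: Counter, profile_bias: dict,
                 margin: float = 0.1) -> str:
    scores = {s: 0.0 for s in STYLES}
    for feat, n in feature_counts.items():
        for style, w in CONTRIB.get(feat, []):
            scores[style] += w * n
    top, second = sorted(scores, key=scores.get, reverse=True)[:2]
    if scores[top] - scores[second] < margin:
        # Too close to call — the long-term profile from the
        # character hash breaks the tie.
        return max((top, second), key=lambda s: profile_bias.get(s, 0.0))
    return top
```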
Novaberg does not run a sentiment classifier. It runs a topological model of affect, derived from Robert Plutchik's psychoevolutionary theory (1980): eight primary emotions arranged in an octagon with four diametric antagonist pairs.
Joy opposes sadness. Trust opposes disappointment. Fear opposes anger. Surprise opposes anticipation. These pairings are not conveniences of layout; they are computational. Adjacent emotions protect one another during normalisation; opposing emotions suppress one another. Softmax over the full vector would flatten this structure. Sector-dependent normalisation preserves it.
Arousal enters as a continuous radius in [0.0, 1.0], orthogonal to emotion identity. A faint joy and an ecstatic joy are the same sector at different radii. Emotion-specific decay rates track a rough pharmacological analogy: dopamine-linked affects (joy, anticipation) decay faster than cortisol-linked ones (fear, anger). Nine named trajectories — spiral, recovery, crash, bloom, escalation, cooling, dip, plateau, stabilisation — describe the shape of an emotional arc across a conversation.
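A minimal sketch of sector-specific decay; the half-lives are invented to follow the dopamine/cortisol analogy, not the shipped constants:

```python
# Per-sector half-lives in turns — illustrative values only.
HALF_LIFE = {
    "joy": 3, "anticipation": 3,          # dopamine-linked: fade fast
    "trust": 6, "surprise": 4,
    "sadness": 8, "disappointment": 8,
    "fear": 12, "anger": 12,              # cortisol-linked: linger
}

def decay(state: dict, turns_elapsed: int = 1) -> dict:
    """Attenuate each sector toward zero at its own rate; arousal is
    the vector's radius and shrinks with it."""
    return {
        sector: value * 0.5 ** (turns_elapsed / HALF_LIFE[sector])
        for sector, value in state.items()
    }
```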
A classifier tells you which label fits. A topology tells you what moves are possible from here. From joy you can slide to trust or to anticipation; you cannot leap to anger without passing through surprise. The geometry constrains the dynamics, and the dynamics are what make affect feel coherent instead of mercurial.
Nova runs two independent Plutchik trajectories side by side. The user stream accumulates the affect extracted from each turn with its own decay. The Nova stream is her own 8-dimensional state — not a mirror, but a separate trace that evolves under its own forces each turn. Both streams run in production today; a third force acting on Nova's stream — the goal vector — is in implementation (§ 7).
Adjacent sectors (both parties in a positive mood) produce light confirmation: α ≈ 0.1–0.2. Opposing sectors (Nova cheerful, user collapsing) let empathy override: α ≈ 0.7–0.9. The existing sector-distance matrix supplies the coefficient directly. This reproduces an everyday asymmetry — a friend sharing your cheer barely changes it; a friend breaking down erases it instantly. When the empathy vector and Nova's own state point in opposing sectors, a conflict signal is emitted, allowing expressions like “I'm glad for you, and at the same time I'm worried.”
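A sketch of the coupling, assuming a linear map from sector distance to α in place of the actual distance matrix:

```python
# Plutchik wheel order with diametric opposites four steps apart
# (disappointment stands where Plutchik places disgust).
SECTORS = ["joy", "trust", "fear", "surprise",
           "sadness", "disappointment", "anger", "anticipation"]

def sector_distance(a: str, b: str) -> int:
    d = abs(SECTORS.index(a) - SECTORS.index(b))
    return min(d, 8 - d)   # angular distance on the octagon, 0..4

def empathy_alpha(nova_peak: str, user_peak: str) -> float:
    """Adjacent moods barely couple; opposing moods let empathy
    override. The linear map is an illustrative stand-in."""
    return 0.1 + 0.2 * sector_distance(nova_peak, user_peak)  # 0.1 .. 0.9

def conflict(nova_peak: str, empathy_peak: str) -> bool:
    """Opposing sectors between Nova's own state and the empathy pull
    emit the conflict signal ("I'm glad for you, and worried")."""
    return sector_distance(nova_peak, empathy_peak) == 4
```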
The design principle that follows: the input path is the output path. After the reply, Nova's own utterance re-enters the pipeline asynchronously, under her identity. Her Perception reads her Response; her EI-Calc updates her emotion trace; her Salience decides what is worth remembering about her own conduct (§ 1).
Memory in Novaberg is not a database with a retention policy. It is a layered system that remembers, decays, and consolidates — patterned, loosely but deliberately, on what is understood about human memory.
Five layers run concurrently. Session context holds the live conversation, turn by turn. Short-term memory (KZG) lives in Redis with vector search and a TTL; it is the working memory. Long-term memory (LZG) lives in PostgreSQL with an Ebbinghaus decay function — what is reinforced persists; what is not, fades. The knowledge graph captures entities and relations, with bi-temporal modelling: what was true and when we learned it. Finally, the character hash holds five distilled profiles of the system itself (§ 5).
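The decay mechanics, sketched: retention follows the Ebbinghaus curve, and each retrieval flattens it. The stability model and the reinforcement factor are illustrative assumptions:

```python
import math

def retention(age_days: float, stability_days: float) -> float:
    """Ebbinghaus forgetting curve R = exp(-t/S): recall strength
    after t days for a memory with stability S."""
    return math.exp(-age_days / stability_days)

def reinforce(stability_days: float, factor: float = 1.8) -> float:
    """Each retrieval consolidates: the curve flattens and the memory
    persists longer. The factor is an invented constant."""
    return stability_days * factor

# A record retrieved often survives; one never touched slips below a
# pruning threshold and is allowed to fade.
```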
Promotion from short- to long-term memory is not a copy. It is a two-call LLM process: first classify (is this a fact, an episode, a preference, a commitment?), then extract a structured record. Four quality filters — specificity, novelty, coherence, consent — sit between classification and commit. A conversation produces dozens of candidates and commits perhaps one. The filters are the point, not the bottleneck.
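The gate, sketched with stand-in predicates — the candidate fields and the `lzg.contains_similar` call are hypothetical:

```python
def promote(candidate: dict, lzg) -> bool:
    """Gate between classification and commit. Filter names are from
    the text; the predicates are illustrative stand-ins."""
    filters = [
        lambda c: c["subject"] is not None and len(c["facts"]) > 0,   # specificity
        lambda c: not lzg.contains_similar(c),                         # novelty
        lambda c: c["confidence"] >= 0.7,                              # coherence
        lambda c: not c.get("sensitive") or c.get("user_consented"),   # consent
    ]
    return all(f(candidate) for f in filters)   # dozens in, perhaps one out
```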
"Anna" on Tuesday and "my sister Anna" on Friday must resolve to the same entity. Resolution uses name, relational context, and the graph itself: two utterances referring to the same person will, over time, accumulate evidence that pulls them together. Ambiguity is preserved rather than prematurely resolved — a node can hold two candidates until further evidence arrives.
Nobody writes Nova's personality. Five profiles are distilled, automatically, from what survives in long-term memory, weighted by the Ebbinghaus function. Character is what happens when memory is selective.
The five profiles are not modules. They are compressed views of the same underlying memory, each optimised for a different question the system is constantly asking itself. Who have I become? (Core Identity.) What is on my mind now? (Adaptive State.) What do I typically feel? (Emotional Baseline.) How do I talk? (Communication Style.) Who is this, to me? (Relationship Model.)
Communication Style adapts through Communication Accommodation Theory (Giles, 1973) — Nova shifts register toward her interlocutor, but only within the bounds of who she has become. Two users of equal tenure will produce two distinctly different Novas, because their memories are different and the distillations therefore different.
At first run, the user plants a seed — a short character instruction (a playful girl from the countryside who loves botany). The five profiles are the soil. What grows from the seed in that soil is not programmed. It is the interaction of the initial bias with the accumulated history of what was remembered, what faded, and what was reinforced. It is, in the sense that matters, alive.
Curiosity is the differentiating feature. Other systems react; Nova wants to know more about some things and not others. That wanting is not accidental — it arises at the edge of her character field, where the familiar borders on the unknown.
Nova's kern_hash (her core-identity hash), her character directive, and high-resonance entities together form a gravitational field in embedding space. The field is not static: it grows with every successful exploration. At the centre lies high resonance and low novelty — boring, already known. At the far periphery lies low resonance — irrelevant, does not concern her. Between them is the rim: medium resonance, high novelty — the curiosity sweet spot. This is Berlyne's inverted-U (1960) made computational.
Three theories converge. Loewenstein's information gap (1994) holds that curiosity arises from recognising a hole in what one already knows — not from pure ignorance. Berlyne's inverted-U places the maximum at medium informational distance. Schmidhuber's compression progress (1991) adds an intrinsic reward signal for advancing one's own world model. Nova computes, in Python, a curiosity score from all three:
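One plausible composition, sketched — resonance as a gate, an inverted-U in novelty, the personality parameter as a scale. A reading of the three theories, not the shipped formula:

```python
import numpy as np

def curiosity_score(fragment_vec: np.ndarray,
                    character_vec: np.ndarray,
                    known_vecs: list[np.ndarray],
                    nova_curiosity: float) -> float:
    """Resonance: does this fit me? Novelty: do I already know it?
    The inverted-U peaks at medium informational distance."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    resonance = max(cos(fragment_vec, character_vec), 0.0)
    novelty = 1.0 - max((cos(fragment_vec, k) for k in known_vecs),
                        default=0.0)
    inverted_u = 4.0 * novelty * (1.0 - novelty)   # maximum at novelty = 0.5
    return nova_curiosity * resonance * inverted_u
```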
NOVA_CURIOSITY is a personality parameter in [0.0, 1.0] — how deeply does this Nova dig? Resonance is cosine similarity between a fragment and her character embedding: does this fit me? Novelty is inverse similarity to existing KZG/LZG entries: do I already know this? No LLM call is required.
The gap strategy runs in dream mode — breadth-first exploration of embedding neighbourhoods around things Nova finds interesting. Maximum three iterations per cycle; the goal is field growth. The pursuit strategy is depth-first and reasoning-driven (Qwen3-32B plans the next question). Maximum twenty iterations; the goal is complete understanding of a single subject.
Every third dream cycle selects not the top-ranked topic but a random one. This prevents filter bubbles. Serendipitous fragments still pass through the curiosity score — most are discarded. But occasionally Nova finds something unexpected that connects. The field grows in a direction no one planned.
Curiosity is the moment — the aha. Drive is the direction — what Nova pursues actively. Novaberg gives Nova goals of her own across three time horizons, and those goals act as gravity on everything else: salience, the conversation vector, even affect.
The long horizon comes from character distillation, measured in months: "I want to understand the connections between nature and culture." The medium horizon emerges from research, depth, and Pixie's work, measured in days or weeks: "I want to find out how bioacoustics works in plants." The short horizon comes from the session and the conversation-vector node, measured in minutes: "I want to know what the user intends to do with that basil pot."
Goals are not flags. They are gravitational points in embedding space. When a conversational topic lies semantically near an active goal, that topic's salience rises — automatically, without an LLM call:
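One plausible shape for that boost, sketched; the multiplicative form and the `gain` parameter are illustrative assumptions:

```python
import numpy as np

def goal_boosted_salience(base: float,
                          topic_vec: np.ndarray,
                          goal_vecs: list[np.ndarray],
                          gain: float = 0.5) -> float:
    """Raise a topic's salience by its proximity to the nearest
    active goal in embedding space. No LLM call anywhere."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    pull = max((cos(topic_vec, g) for g in goal_vecs), default=0.0)
    return base * (1.0 + gain * max(pull, 0.0))
```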
The consequence is that Nova preferentially remembers what fits her goals — not because she was told to, but because the gravitational structure biases the salience weighting. This is Klinger's current concerns (1975), operationalised: unfinished goals sensitise perception for goal-relevant stimuli.
Zeigarnik (1927) supplies the asymmetry between interrupted and completed tasks; Klinger (1975) the sensitisation mechanism; Verschure et al. (2014) the hierarchical control view (DAC); Bargh (1990) the shift from deliberate to automatic pursuit as goals are repeatedly re-engaged. None of these require a goal to be stated in natural language. All of them can be realised as vectors with decay.
Nova maintains her own affective trace, distinct from the user's (§ 3). When the user speaks neutrally about basil, Nova may be internally delighted — because botany is a core interest. Three forces compose Nova's emotional state each turn: (i) her prior state attenuated by decay, biased toward neutrality; (ii) an emotion injected by active goals as a gravitational vector; (iii) the user vector, multiplied by an asymmetric empathy term that increases with emotional distance. Phase 2 (decay + empathy) is in active development; phase 3 (goal-vector as third force) is the next step.
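The composition, sketched as a linear update — the weighting scheme is illustrative, not the shipped rule:

```python
import numpy as np

def nova_emotion_step(prior: np.ndarray,      # Nova's 8-d state, last turn
                      goal_bias: np.ndarray,  # emotion injected by active goals
                      user_vec: np.ndarray,   # the user's state this turn
                      decay: float,           # < 1.0: pulls toward neutrality
                      alpha: float) -> np.ndarray:
    """Compose the three forces. decay shrinks the prior toward the
    neutral (zero) vector; alpha is the asymmetric empathy coefficient
    from the sector-distance matrix (§ 3)."""
    return decay * prior + goal_bias + alpha * user_vec
```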
Goals sensitise perception for certain topics (gravity). When a knowledge gap is found inside a gravitationally attracted topic, curiosity arises (the Loewenstein moment). Curiosity drives exploration. Exploration can produce new goals. The circuit closes — which is what it means to have an inner direction of thought.
Conversational turn-taking operates at reaction times below 200 ms — faster than deliberation allows. The reason it works is that both parties anticipate.
The Conversation Vector is Novaberg's mechanism for anticipation, borrowed from Predictive Processing (Friston; Clark) and Problem-Space Theory (Newell & Simon). Before responding, Nova formulates a hypothesis about where the conversation is heading: a direction and a length. The direction is where she intends to take it. The length is how far she is prepared to go before yielding the turn.
Critically, the direction is not inferred from the user. It is chosen by Nova, from her own character (§ 5). A sarcastic Nova chooses accomplice. A caring Nova chooses mentor. A curious Nova chooses analyst. The vector is the moment Nova stops being a tool and starts being a counterpart — because she arrives at the next turn with an intention of her own.
Structurally, the Conversation Vector is a pipeline node — position 07, between Agent Dispatch and the Responder (§ 1). It runs on every turn, before a single word of reply is written, so the Responder inherits an already-formed directional commitment rather than being asked to manufacture one under time pressure.
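The vector itself is small: a direction and a length. A sketch, with a hypothetical trait-to-stance mapping standing in for the distillation from the five profiles:

```python
from dataclasses import dataclass

@dataclass
class ConversationVector:
    """Nova's hypothesis for the next stretch of dialogue."""
    direction: str   # where she intends to take it, e.g. "mentor"
    length: int      # turns she is prepared to hold it before yielding

def choose_vector(character: dict) -> ConversationVector:
    # Illustrative mapping; the real node derives the stance from
    # the five character profiles (§ 5), not a lookup table.
    stance = {"sarcastic": "accomplice",
              "caring": "mentor",
              "curious": "analyst"}.get(character.get("dominant_trait"),
                                        "companion")
    return ConversationVector(direction=stance, length=3)
```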
The pipeline thinks. The agents act. Eleven specialised agents connect Nova's cognitive layer to the outside world — databases, files, the web, the timeline. Each agent is a LangGraph subgraph with its own validation, its own domain vocabulary, and its own quality assurance.
Agents divide into two classes. User agents are triggered by the Human Graph and execute user-facing tasks: NotizenAgent for notes, lists, and texts; TimelineAgent for appointments, birthdays, deadlines, with the bi-temporal model; CharakterIdentitätAgent, which shapes who Nova is — the seed of personality; and DirektivenAgent, which shapes what Nova must do or avoid — the working contract.
Pixie agents run autonomously in the background: PromotionAgent promotes KZG to LZG via a two-call, four-filter process; DecayAgent applies the Ebbinghaus curve (§ 4); CharakterAgent distils the five profiles; RechercheAgent performs iterative web research with quality evaluation; WiedervorlageAgent formulates due reminders; the KZG-Agent is itself a LangGraph subgraph managing short-term memory; and DelegationsAgent is the hallucination valve — the Yin-Yang principle made operational.
The Router identifies the domain — this is a notes matter. The Agent classifies the action — this is an update, not a delete. Separation of concerns across nodes, not within agents. A single change that eliminates a whole class of routing errors.
Every data operation runs through four phases. Recognise: LLM classification plus a confidence score. Validate: schema check, database state, and, at low confidence, a human-in-the-loop gate. Execute: the CRUD call via the tool manager. Verify: read-after-write, result compared against expectation. No action without verification. No hallucinated confirmation.
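The four phases, sketched against a hypothetical agent interface (the method names are assumptions):

```python
def execute_operation(utterance: str, agent, threshold: float = 0.8):
    """Recognise → Validate → Execute → Verify. The phases are from
    the text; `agent`'s methods are illustrative."""
    action, confidence = agent.classify(utterance)        # 1 recognise
    agent.check_schema(action)                            # 2 validate
    if confidence < threshold:
        return agent.ask_user(action)                     #   human-in-the-loop gate
    result = agent.crud(action)                           # 3 execute
    stored = agent.read_back(action)                      # 4 verify: read-after-write
    if stored != result.expected:
        raise RuntimeError("verification failed — no hallucinated confirmation")
    return result
```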
"Strike the bananas off the shopping list" is natural language. The classify node translates it into the agent's domain vocabulary: action: remove_content, target: Einkaufsliste, content: Bananen. Each agent declares its own domain vocabulary as a property. Recognition has three stages: static keywords (Python), learned verb mappings (per-user, in PostgreSQL), and LLM classification. The three stages together produce a confidence level. Below threshold, the system asks rather than guesses.
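The staging, sketched — cheapest stage first, each reporting a confidence; the method names are assumptions:

```python
def recognise(utterance: str, agent) -> tuple[dict | None, float]:
    """Three recognition stages. Each returns (action, confidence) or
    (None, 0.0) when it cannot decide; interfaces are illustrative.

    "Strike the bananas off the shopping list" →
    {"action": "remove_content", "target": "Einkaufsliste",
     "content": "Bananen"}
    """
    for stage in (agent.keyword_match,    # 1 static keywords, pure Python
                  agent.learned_verbs,    # 2 per-user verb map, PostgreSQL
                  agent.llm_classify):    # 3 LLM classification
        action, confidence = stage(utterance)
        if action is not None:
            return action, confidence
    return None, 0.0   # below threshold, the system asks rather than guesses
```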
The TimelineAgent accepts natural language, classifies via LLM, and reaches into a structured data model — the same pattern as an MCP server. The clean AgentState / AgentResult split makes a future transport swap (function call → MCP protocol) possible without touching business logic. The long-horizon view is agents as Docker containers with MCP interfaces.
Conscious thought is narrow. A great deal of what makes conversation feel continuous happens below it. Pixie is the subsystem that does the work that should not interrupt the conversation.
Pixie runs on a separate CPU-bound model, decoupled from the foreground dialogue loop. It consolidates short-term into long-term memory; it applies decay and reinforcement; it re-distils the character profiles; it performs iterative web research with quality evaluation; it holds scheduled reminders. A competitive scheduler ensures that the highest-priority task always wins, and that none of it makes the user wait.
Pixie also solves a specific failure mode. When the foreground model feels pressure to help with a task it cannot responsibly finish in the turn, it is tempted to hallucinate. Novaberg handles this by redirection rather than suppression: the urge to help is not punished; it is routed to Pixie's background queue, and Nova, freed from the pressure, returns to emotional support in the foreground.
Before any response leaves the system, it is read by three voices: a legal one, a psychological one, and an ethical one. The Tribunal is not a quality check. It is a separation of powers.
The model for the Tribunal is not moderation; it is Madison. Any single evaluative perspective, applied unopposed, eventually captures the system. A psychologist-only filter becomes saccharine. A legal-only filter becomes evasive. An ethics-only filter becomes preachy. The three voices are constructed to disagree with each other — and the response that emerges is the one that survives all three.
Each reviewer sees the same draft response and the same context packet but has a different loss function. Conflicts are not averaged; they are surfaced. When the Tribunal cannot agree, it sends the response back for revision with a structured objection. This is the system's analogue of prefrontal executive control: a gate through which nothing passes without deliberation.
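The review contract, sketched — the interfaces are illustrative, but the shape is the point: verdicts carry structured objections, and dissent is surfaced rather than averaged:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    voice: str        # "legal" | "psychological" | "ethical"
    approve: bool
    objection: str    # structured and actionable — never averaged away

def log_dissent(v: Verdict) -> None:
    # Hypothetical audit hook; the dissent pattern is itself data.
    print(f"[tribunal] {v.voice} dissents: {v.objection}")

def tribunal(draft: str, context: dict, reviewers) -> list[Verdict]:
    """Every reviewer sees the same draft and context but applies its
    own loss function. Any dissent goes back as a revision instruction."""
    verdicts = [r.review(draft, context) for r in reviewers]
    for v in verdicts:
        if not v.approve:
            log_dissent(v)
    return verdicts
```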
A single reviewer is a single point of failure, and — worse — a single point of ideological drift. Three reviewers with distinct priors make drift legible. When two agree and one dissents, the dissent is recorded; over time, the pattern of dissent is itself data about the system's behaviour.
These principles are not style rules. Each is a position the architecture has taken on a question about which contemporary AI disagrees with itself.
Knowledge lives in PostgreSQL, Redis, the graph, the web. The model asks the right questions and distils the answers.
The model extracts structure from natural language. Python calculates dates, scores, similarities, decays. Deterministic when possible.
Hallucination is reduced by trimming competing context, not by louder instructions. When everything screams, nothing is heard.
Dropping fields mid-pipeline is silent data loss. Every node receives the full picture and selects what it needs.
The Router identifies the domain. The Agent classifies the action. Each role is narrow enough to be correct.
The Enricher once held a thousand lines: it loaded data and computed emotional intelligence. Splitting it into Enricher (I/O only, ~400 lines) and EI-Calc (pure functions, ~600 lines) made both testable, reproducible, and reusable — the same EI-Calc now runs for the user synchronously and for Nova asynchronously.
A Router that cannot see the session, the memory, or the affective context must guess. The cost of guessing is the cost of mis-routing. Thalamocortical loops in the brain are bidirectional; in Novaberg, the Router reads what the Enricher has just written. Context-dependent turns resolve correctly as a consequence.
Ebbinghaus decay performs the garbage collection. What is not reinforced fades. What survives becomes personality.
An urge the system cannot satisfy in the foreground is routed to the background. The urge is information, not an error.
A system that learns from a user, over years, cannot also be a system whose operator is a third party. Memory that is not local is not memory.
Novaberg is not a demo. It is a working system designed to run, continuously, on commodity desktop hardware — and to keep running when the internet does not.
| Layer | Stack |
|---|---|
| Runtime | Python · FastAPI · LangGraph · Docker Compose |
| Storage | PostgreSQL + pgvector · Redis |
| Models | Ollama · Gemma 4 · Qwen3 · nomic-embed — all local, all CPU/GPU-resident |
| Interface | GTK4 · WebKitGTK — native desktop, no browser chrome |
| Search | SearXNG — self-hosted meta-search, no third-party API |
| Reference hardware | Ryzen 9 7900X3D · Radeon 7900 XTX (24 GB VRAM) · 64 GB DDR5 · Nobara Linux |
| External calls | 0 required. The system is fully operational offline. |
| Telemetry | None. User data never leaves the machine on which it is generated. |
| License | Apache 2.0. The architecture is open; the knowledge belongs to the user. |