Single vs Multi-Agent

Splitting an agent into several doesn't always make it smarter

Posted on July 4, 2026, 3:50 p.m. by SANGJIN

Before scaling an agentic workflow into a multi-agent pipeline, it's worth asking whether the added complexity actually pays for itself once token cost is counted.

Single-agent overhead accumulates vertically: context grows turn by turn, but prompt caching keeps reuse of the earlier turns efficient. Multi-agent overhead spreads horizontally: each sub-agent carries its own context, so shared background information is duplicated, inter-agent messages are paid for in tokens, and role definitions are reloaded on every call. Cache efficiency also drops, because each agent is invoked with a different role and prompt, which breaks the cache chain that single-agent flows benefit from.
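To make the vertical vs. horizontal picture concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names, the token counts, and especially the `cached_discount` of 0.1 (cached input billed at 10% of the normal price) are hypothetical, not any provider's actual pricing. It also assumes the multi-agent calls get no cache reuse at all, which is the worst case.

```python
def single_agent_input_cost(shared_context, tokens_per_turn, turns,
                            cached_discount=0.1):
    """Input cost (in full-price token units) for one agent over several turns.

    The context grows every turn, but tokens already sent in an earlier
    turn are assumed to be cache hits billed at `cached_discount`.
    """
    total = 0.0
    cached_prefix = 0  # tokens sent in earlier turns (assumed cacheable)
    for turn in range(turns):
        # The shared background is only new on the first turn.
        new_tokens = tokens_per_turn + (shared_context if turn == 0 else 0)
        total += cached_prefix * cached_discount  # discounted re-read
        total += new_tokens                       # full price for new tokens
        cached_prefix += new_tokens
    return total


def multi_agent_input_cost(shared_context, role_prompt, message_tokens,
                           agents, calls_per_agent):
    """Input cost when every call reloads role + background at full price.

    Assumes no cache reuse across agents (hypothetical worst case).
    """
    per_call = role_prompt + shared_context + message_tokens
    return float(agents * calls_per_agent * per_call)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 tokens of background, 200 tokens per step.
    single = single_agent_input_cost(1000, 200, turns=3)
    multi = multi_agent_input_cost(1000, role_prompt=300, message_tokens=200,
                                   agents=3, calls_per_agent=1)
    print(f"single-agent: {single:.0f}, multi-agent: {multi:.0f}")
```

With these made-up numbers the three-agent split pays for the background three times plus a role prompt each, while the single agent pays for it once and then re-reads it at the discounted rate. Real ratios depend on the actual cache hit rate and pricing.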

Output format matters just as much as agent count. Plain text and markdown are cheap because the model writes them directly. Office documents (.xlsx/.docx) cost far more: they are generated through code and then often rendered to images for verification, and a single image can cost as much as hundreds to thousands of text tokens. Likewise, single-shot prompting is cheap and predictable, while agentic execution loops re-include the full prior context on every iteration, so cost compounds with each retry.
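The compounding effect of a loop can be sketched with simple arithmetic. The helper below is a hypothetical illustration, not a measurement: it assumes each iteration resends the entire context so far and then appends a fixed number of tokens (the model's output, a tool result, or the token equivalent of a rendered image). It ignores caching and output-token pricing.

```python
def loop_input_tokens(base_context, tokens_added_per_iteration, iterations):
    """Total input tokens when every iteration resends all prior context."""
    total = 0
    context = base_context
    for _ in range(iterations):
        total += context                       # the whole context is sent again
        context += tokens_added_per_iteration  # this step's result is appended
    return total


def loop_input_tokens_closed_form(base_context, tokens_added_per_iteration,
                                  iterations):
    """Same quantity in closed form: n*base + added * n*(n-1)/2."""
    n = iterations
    return n * base_context + tokens_added_per_iteration * n * (n - 1) // 2


if __name__ == "__main__":
    # Hypothetical numbers: 2,000-token prompt, 500 tokens added per iteration.
    for n in (1, 3, 5, 10):
        print(n, loop_input_tokens(2000, 500, n))
```

A single-shot call corresponds to `iterations=1`. Because of the `n*(n-1)/2` term, total input grows roughly quadratically with the number of iterations, which is why retries in a heavy output format get expensive quickly.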

Key takeaways:

Multi-agent is not automatically better — it trades token cost and cache efficiency for parallelism and context isolation

Output format is part of cost design — don't reach for a heavier format or an agentic loop when a simple text output would do

Avoid unnecessary structural complexity — validate with a single-agent flow first, then split only when the task is genuinely parallelizable
