LLMs and Emojis [D]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
LLMs are trained on human data, so where does the tendency to add emojis come from?
For example, when some models generate code explanations or even ordinary responses, they often add far more emojis than people would actually use in that kind of writing.
My current guess (without having researched this yet) is that emojis might sometimes be added after the initial generation process, maybe during post-processing, alignment, or some “reasoning/thinking” stage, rather than being part of the raw generated response itself.
My intuition here is that an emoji doesn't really behave like a normal word or token inside a sentence or code block.
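One quick way to probe that intuition is to look at what an emoji is at the text-encoding level. The sketch below uses only the Python standard library and shows the Unicode code point and UTF-8 bytes for a letter, an accented letter, and an emoji. The helper name `utf8_bytes` is made up for illustration, and this is not any particular model's tokenizer. It only shows the raw bytes that a byte-level tokenizer would start from, under the assumption that the model uses one.

```python
def utf8_bytes(text: str) -> list[int]:
    """Return the UTF-8 byte values for `text` (hypothetical helper name)."""
    return list(text.encode("utf-8"))


# An ASCII letter, an accented letter, and an emoji.
samples = ["a", "é", "🚀"]

for s in samples:
    code_points = [f"U+{ord(ch):04X}" for ch in s]
    print(repr(s), code_points, utf8_bytes(s))
```

At this level an emoji is just a longer byte sequence than a plain letter, so nothing here rules out emojis being emitted during ordinary generation. How a specific model actually splits those bytes into tokens depends on its tokenizer, which this sketch does not cover.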