gpt-image-2

Education & Research

Research Figure

Transformer Encoder Decoder Architecture Machine Learning Conference Style Academic Concept Figure

1 image, 1 prompt version

Prompt Text

16:9 landscape academic technical diagram of the Transformer encoder-decoder architecture, clean machine-learning conference camera-ready publication style, clean professional LaTeX Computer Modern typography, high contrast monochrome with soft light gray and pale blue accent colors for clarity, no background clutter, sharp lines, fully legible text.
Layout: Two equal vertical column stacks placed side-by-side, separated by a thin light gray dashed vertical divider line.
Left column details: Bold header at the top of the left column reads "ENCODER (xN)". Blocks are stacked vertically from bottom to top in the following order: 1. "Input tokens" solid rectangular block, 2. "Input Embedding" solid rectangular block, 3. "+ Positional Encoding" solid rectangular block, 4. Dashed-outline rectangular block labeled "Encoder layer" containing internal stacked sub-blocks from bottom to top: "Multi-Head Self-Attention", "Add & Norm", "Feed-Forward", "Add & Norm", with thin curved pale blue residual arrows looping around each individual sublayer connecting the sublayer input to the output of its corresponding Add & Norm step.
Right column details: Bold header at the top of the right column reads "DECODER (xN)". Blocks are stacked vertically from bottom to top in the following order: 1. "Output tokens (shifted right)" solid rectangular block, 2. "Output Embedding" solid rectangular block, 3. "+ Positional Encoding" solid rectangular block, 4. Dashed-outline rectangular block labeled "Decoder layer" containing internal stacked sub-blocks from bottom to top: "Masked Multi-Head Self-Attention", "Add & Norm", "Multi-Head Cross-Attention", "Add & Norm", "Feed-Forward", "Add & Norm". A thick solid horizontal arrow points from the top of the left encoder stack directly to the "Multi-Head Cross-Attention" sub-block, with small adjacent text label reading "keys, values". Above the decoder layer stack, three consecutive solid rectangular blocks are stacked vertically, labeled "Linear", "Softmax", "Output probabilities" from bottom to top.
Centered at the very top of the full diagram: Large bold main title reads "Transformer: encoder-decoder with multi-head attention". Centered directly below the main title: Smaller italic subtitle reads "original academic concept figure". Consistent uniform spacing between all blocks, no visual noise, high resolution suitable for academic presentation.
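
For readers checking the figure against the architecture, the sub-block ordering the prompt asks for (each sublayer followed by "Add & Norm", with a residual path around it) can be sketched in a few lines of plain Python. This is a minimal illustration, not a real model. It works on a single vector rather than a token sequence, the layer norm has no learned scale or shift, and the sublayers (`identity`, `toy_cross_attention`) are hypothetical placeholders standing in for attention and feed-forward blocks.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (near) unit variance; no learned parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def add_and_norm(x, sublayer):
    """'Add & Norm': residual sum of the sublayer input and output, then layer norm."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

def encoder_layer(x, self_attention, feed_forward):
    # Bottom to top: Self-Attention, Add & Norm, Feed-Forward, Add & Norm.
    x = add_and_norm(x, self_attention)
    x = add_and_norm(x, feed_forward)
    return x

def decoder_layer(x, memory, masked_self_attention, cross_attention, feed_forward):
    # Bottom to top: Masked Self-Attention, Add & Norm,
    # Cross-Attention (keys and values come from the encoder output), Add & Norm,
    # Feed-Forward, Add & Norm.
    x = add_and_norm(x, masked_self_attention)
    x = add_and_norm(x, lambda q: cross_attention(q, memory))
    x = add_and_norm(x, feed_forward)
    return x

# Hypothetical placeholder sublayers, used only to exercise the wiring.
def identity(v):
    return list(v)

def toy_cross_attention(query, memory):
    # Assumption for illustration: simply return the encoder output.
    return list(memory)

enc_out = encoder_layer([1.0, 2.0, 3.0, 4.0], identity, identity)
dec_out = decoder_layer([0.5, -0.5, 1.5, 0.0], enc_out,
                        identity, toy_cross_attention, identity)
```

The lambda in `decoder_layer` mirrors the horizontal "keys, values" arrow in the prompt: the decoder state acts as the query, and the encoder output is passed in from the side.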