and Václav Jonáš Podlipský1
Abstract
We conducted a comparative reception analysis of Spanish‑to‑Czech translations produced by OpenAI’s GPT‑3.5 and GPT‑4 and by DeepSeek‑V3 across two text domains (marketing and literary), two evaluation criteria (naturalness and grammar), and two prompting strategies (simple vs. detailed). We also assessed the consistency of these translations using the Levenshtein distance metric. In a reading task, 132 native speakers of Czech rated the translations on a 5‑point Likert scale. Contrary to previous findings, simple prompts produced reliably higher‑quality translations than detailed prompts. Literary translations were rated lower than marketing translations, and grammar ratings exceeded naturalness ratings. GPT‑4 outperformed the other two models, but only on the literary translations. Finally, DeepSeek‑V3 showed greater consistency but lower quality in literary translation, suggesting that increased consistency may come at the expense of creativity. These findings provide empirical insights into how prompting strategy, text type, and model choice may influence machine translation quality.
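As an illustration of the consistency measure mentioned above, the following is a minimal sketch of how repeated translations of one source text could be compared using Levenshtein distance. The abstract does not specify the exact procedure, so the pairwise averaging, the normalization by the longer string's length, and the function names `levenshtein` and `mean_pairwise_distance` are assumptions made for this example, not the authors' method.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def mean_pairwise_distance(outputs: list[str]) -> float:
    """Average normalized distance over all pairs of outputs.

    Assumption: each distance is divided by the longer string's length,
    so 0.0 means identical outputs and values near 1.0 mean very different ones.
    """
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for x, y in pairs:
        longest = max(len(x), len(y))
        total += levenshtein(x, y) / longest if longest else 0.0
    return total / len(pairs)
```

Under this sketch, a lower mean pairwise distance across repeated runs of the same prompt would indicate a more consistent model.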