Calculation models: artificial intelligence - large language models

This companion document is to be used in line with the principles set out in the main document "Calculation Models; guidance for the use of software for engineering calculations" and makes references to that main document.

While this companion can be followed independently, the reader will benefit from understanding the language and terminology used in the main document.

Large language models (LLMs) for structural engineering: an introduction

There are three main types of LLMs relevant to structural engineering, each with different benefits and limitations:

Web-based LLMs

These are accessed online and offer the most advanced models and broad knowledge. However, they raise data privacy concerns and are best used for general guidance, not confidential project data.

Avoid sharing any confidential, personal, or project-sensitive information. Even if the provider claims not to store your data, it may be used for service improvement or logged for a period. Treat these like public platforms.

Integrated LLMs

These are built into design tools or corporate platforms. They assist with tasks like drafting reports or code generation within familiar software but may still process data externally, requiring careful oversight.

Before sharing any confidential or personal information, you should check your organisation’s agreement with the vendor. Many integrated tools still use external servers (e.g. Microsoft Copilot using Azure OpenAI), so while they are more secure than public tools, they are not always private by default.

Siloed LLMs

These are fully contained within an organisation’s infrastructure. They offer maximum security and control and are suitable for sensitive projects, though they may lack access to the latest training data or features.

These are the safest for sensitive data because they're hosted entirely within your organisation. You still need to check your internal data governance, but nothing should leave your environment if the system is properly configured.

Using LLMs to produce work

This guidance is largely focused on how to use and check work created with LLM assistance.

Below are six principles structural engineers should follow when using AI language models to assist in the production of calculations.

1) Set reasonable expectations for the LLM’s abilities

Treat an LLM as an inexperienced junior engineer: expect it to get answers wrong. A junior engineer would ask for help when they need it; unfortunately, LLMs are overly confident and unlikely to do so. Remember that, without experience, even the smartest graduate could not deliver a whole project.

2) Put the LLM in context

Begin your prompt with the context of what you are looking at. This will help the AI to find resources that are more specific to your problem. It would be sensible to tell the AI:

Field: tell it that it is a structural engineer,
Location: what country the structure is in (for relevant codes), adding the city may be beneficial,
Typology: the type of structure and materials it's constructed from.

And if working on a historic structure add:

Age: the time period it was originally constructed in,
Alteration: the main reason for the alteration being done to it.

Example:
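
A minimal sketch of how the context items above might be assembled into a reusable opening for a prompt. The function name, the sentence wording and the sample project details are hypothetical illustrations, not taken from the main document.

```python
from typing import Optional


def build_context(field: str, location: str, typology: str,
                  age: Optional[str] = None,
                  alteration: Optional[str] = None) -> str:
    """Assemble the context items into an opening paragraph for a prompt."""
    parts = [
        f"You are a {field}.",
        f"The structure is located in {location}.",
        f"It is a {typology}.",
    ]
    # Age and alteration are only added for historic structures.
    if age:
        parts.append(f"It was originally constructed in {age}.")
    if alteration:
        parts.append(f"The main reason for the alteration is {alteration}.")
    return " ".join(parts)


# Hypothetical project details, for illustration only.
context = build_context(
    field="structural engineer",
    location="Manchester, United Kingdom",
    typology="four-storey steel-framed office building with composite floors",
)
print(context)
```

The specific question then follows this opening, so the same context can be reused across the smaller prompts for a project.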

3) Break the task into steps

For calculations, this may mean asking it to provide the formula (with references) for a single aspect of design, e.g. checking the moment resistance of a beam or the shear capacity of a column. Don’t ask it to give the process to design every aspect of the element in one go.

For reports, this means one short section or paragraph at a time. Don’t ask it to provide a full report unless the purpose is purely idea generation, where it can be treated more as a brainstorming partner.

4) State clearly what the output should look like

When asking someone in a design office to undertake a piece of work, conventions and experience mean that the format of the output is implied and does not need to be stated. LLMs do not share that implicit understanding; they need to be told.

For calculations, tell the LLM to:

Clearly reference design standards, including section, clause and equation numbers,
Provide the formulae in full,
And provide definitions for all the input and output parameters.
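
One way to make these requirements concrete is to decide in advance what a complete answer contains, and to check each response against it. The sketch below is a hypothetical layout: the class, its field names and the placeholder values are assumptions for illustration. It only detects missing items and says nothing about whether a reference is genuine.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FormulaRecord:
    """One formula as the LLM is asked to present it (hypothetical layout)."""
    standard: str = ""
    clause: str = ""
    equation_number: str = ""
    formula: str = ""
    parameters: Dict[str, str] = field(default_factory=dict)


def missing_items(record: FormulaRecord) -> List[str]:
    """List the items still needed before the formula can be checked."""
    names = ("standard", "clause", "equation_number", "formula")
    missing = [name for name in names if not getattr(record, name).strip()]
    if not record.parameters:
        missing.append("parameters")
    return missing


# Placeholder values only: every reference must be confirmed
# against your own copy of the standard.
record = FormulaRecord(
    standard="<standard title and year>",
    clause="<clause number>",
    equation_number="",
    formula="M_Rd = W * f_y / gamma_M0",
    parameters={"W": "section modulus", "f_y": "yield strength",
                "gamma_M0": "partial factor"},
)
print(missing_items(record))  # ['equation_number']
```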

For report writing:

Ask for information to be cited from credible sources, such as officially published design guides produced by standards organisations or professional institutions (e.g. IStructE),
State clearly whether academic literature should be considered and included,
If you are providing reports to the LLM to reference, state clearly which sections you believe are useful and relevant.

You may wish to provide an explicit list of which standards and guides it should be using.

5) Tell the LLM what responsibilities remain human

LLMs are powerful tools, but at the current time they are not capable of reliably performing calculations. It is important to tell the AI what its responsibilities are.

In their own words, here is the result of asking an LLM: “How reliable are LLMs at doing calculations? And can you trust them?”

The response generated is as follows:

LLMs are not inherently reliable at performing calculations—you should not fully trust them without verification.

Here's a breakdown:

What LLMs are good at (with calculations):

Explaining calculation methods (e.g. how to use a formula).
Writing code to perform calculations (Python, Excel formulas, etc.).
Describing standards and structural design procedures if the references are real and known.
Giving examples with illustrative numbers for clarity.

Where they can go wrong:

Arithmetic mistakes, especially with long or multi-step calculations. LLMs don't "calculate"; they predict text.
Fabricating fake formulae or non-existent clauses.
Misapplying units or using inconsistent dimensions.

Being overconfident in incorrect answers—they sound right even when they’re wrong.

~Generated using ChatGPT (GPT-4), developed by OpenAI. Output verified and curated by IStructE, June 2025.

With this in mind, you should tell the LLM not to perform the calculations, only to provide guidance on the process. State that the calculation itself will be worked through by a human. It will help to provide an example of how you would like the information to be formatted.
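
As an illustration of keeping the arithmetic in human hands, the sketch below shows an engineer evaluating a formula in Python once the LLM has supplied it and it has been checked. The expression is the familiar plastic bending resistance form used in Eurocode 3. The function name, the input values and the default partial factor of 1.0 are assumptions for illustration and should be confirmed against the applicable standard and National Annex.

```python
def bending_resistance_knm(w_pl_mm3: float, f_y_n_mm2: float,
                           gamma_m0: float = 1.0) -> float:
    """Plastic bending resistance M_Rd = W_pl * f_y / gamma_M0, in kNm.

    Inputs are in mm^3 and N/mm^2, so the product is in Nmm;
    dividing by 1e6 converts Nmm to kNm.
    """
    if gamma_m0 <= 0:
        raise ValueError("gamma_m0 must be positive")
    return w_pl_mm3 * f_y_n_mm2 / gamma_m0 / 1e6


# Hypothetical inputs: W_pl = 1.0e6 mm^3, f_y = 355 N/mm^2.
m_rd = bending_resistance_knm(w_pl_mm3=1.0e6, f_y_n_mm2=355.0)
print(f"M_Rd = {m_rd:.1f} kNm")  # M_Rd = 355.0 kNm
```

Stating the units in the parameter names makes a unit mismatch, one of the common LLM errors listed above, easier to spot.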

6) Review and challenge

Check whether the logic is correct, and whether the references are real. Challenge it to refine, correct, and expand responses.

Note:

If you are asking design-related questions, you should always have a copy of the design code or standard to hand and should check that the equations the LLM has provided appear in it. LLMs often hallucinate and produce equations which do not exist.
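
A small script can help keep track of which cited clauses have been looked up so far. The sketch below is illustrative only: the citation pattern, the function name and the sample text are assumptions, and the clause numbers are invented. It cannot confirm that a clause exists; it only lists the citations that the engineer has not yet found in the standard.

```python
import re
from typing import List, Set

# Matches citations written as "clause 6.2" or "cl. 6.2" (an assumed convention).
CLAUSE_PATTERN = re.compile(r"\b(?:clause|cl\.)\s*(\d+(?:\.\d+)*)", re.IGNORECASE)


def unverified_clauses(llm_text: str, verified: Set[str]) -> List[str]:
    """Return cited clause numbers that the engineer has not yet confirmed."""
    cited = CLAUSE_PATTERN.findall(llm_text)
    # Preserve first-seen order while removing duplicates.
    unique = list(dict.fromkeys(cited))
    return [clause for clause in unique if clause not in verified]


# Invented text and clause numbers, for illustration only.
response = "Use clause 4.2.1 for the resistance and cl. 9.9.9 for the reduction factor."
checked_by_hand = {"4.2.1"}
print(unverified_clauses(response, checked_by_hand))  # ['9.9.9']
```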

If you are asking questions related to the analysis of a structure, you should ask for references for the equations and perform simplified checks to confirm the analysis results are correct.
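
A simplified check of this kind can be as short as a hand formula and a tolerance. The sketch below uses the standard result for a simply supported beam under a uniformly distributed load, M = wL²/8. The function names, the 5% tolerance and the numerical values are assumptions for illustration.

```python
def udl_midspan_moment(w_kn_per_m: float, span_m: float) -> float:
    """Maximum moment of a simply supported beam under a UDL: M = w * L^2 / 8 (kNm)."""
    return w_kn_per_m * span_m ** 2 / 8.0


def within_tolerance(reported: float, hand_check: float,
                     tolerance: float = 0.05) -> bool:
    """True if the reported value is within the relative tolerance of the hand check."""
    return abs(reported - hand_check) <= tolerance * abs(hand_check)


# Hypothetical values: 10 kN/m over a 6 m span; the analysis reported 46.1 kNm.
hand = udl_midspan_moment(w_kn_per_m=10.0, span_m=6.0)
print(hand)                          # 45.0
print(within_tolerance(46.1, hand))  # True
print(within_tolerance(60.0, hand))  # False
```

A result outside the tolerance does not show which value is wrong, only that the analysis needs a closer look.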

Example of a well set-up prompt:

The expectations for what you wish the LLM to produce should be set in line with Principle 1. Here you are not expecting the LLM to do the work for you, but to save you time and to help find the references that you are looking for.

Checking work produced with LLMs

A checking engineer should treat the work produced by an LLM as though it has been produced by a graduate.

For documents: the checking engineer should review them as they would any other report or drawing produced by a graduate engineer.

For calculations: the checking engineer should be familiar with the process of any calculation, and should review that the process has been done correctly.

However human an LLM seems, it is important to remember that it is just a piece of software. The people who create the work, no matter which tool they use, are responsible for the output: the person who used the LLM to create the work is responsible for its quality.

Using LLMs as checking engineers

This section can be summarised as: don’t expect the AI to do the checks for you.

Checking engineers should treat the output from LLMs in the same way they would treat the work of another engineer. It cannot be assumed that it will be correct.

The Design Manual for Roads and Bridges (DMRB) is the UK’s primary technical standard for the design, assessment and operation of motorway and trunk road infrastructure, and defines the check categories referred to below:

Category 0 and 1 checks

There has been recent speculation about LLMs being used to check whether structural designs are “correct” without any human review. This is strongly discouraged. The Institution’s position is that LLMs cannot currently be relied upon as independent checkers. They lack the necessary experience, judgment, and contextual understanding to recognise subtle but serious errors.

LLMs may be used by the design engineer themselves to assist with:

Grammar or formatting checks,
Spotting omissions or inconsistencies in their own reasoning.

However, once a piece of work is issued for review, it must be checked by a competent, chartered (or professionally licensed) human engineer who accepts responsibility for its correctness.

Category 2 and 3 checks

DMRB Category 2 and 3 checks require independence not only in people, but also in process and tooling.

The LLM used during design (including any commercial models, private APIs, or custom company tools) must be clearly recorded as part of the checking handover documentation. The checking engineer or organisation should not use the same LLM, or any tool based on the same underlying model, to support their independent analysis or review.
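
A sketch of what such a record might contain is given below. The class and field names are hypothetical and do not represent a DMRB or Institution format, and the tool and model names are invented; the point is that the underlying model is recorded alongside the branded tool name.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LLMUsageRecord:
    """Details of an LLM used during design (hypothetical fields)."""
    tool_name: str         # name the tool is known by within the organisation
    provider: str          # vendor or internal team supplying the tool
    underlying_model: str  # the base model the tool is built on
    deployment: str        # "web-based", "integrated" or "siloed"
    used_for: str          # what the designer used the tool for


# Invented names, for illustration only.
record = LLMUsageRecord(
    tool_name="ExampleCo Design Assistant",
    provider="ExampleCo internal tools team",
    underlying_model="example-model-v1",
    deployment="integrated",
    used_for="Drafting calculation method statements",
)
print(json.dumps(asdict(record), indent=2))
```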

This is particularly important because:

Many company-built LLM tools are “wrappers” around commercially available engines (e.g. those from OpenAI or Anthropic).
Even if branded differently, they may behave similarly, leading to common blind spots in design and check stages.

To preserve independence, companies must understand which LLMs their internal tools are based on, and ensure software separation between designer and checker, as would be expected with any other software used as part of the structural analysis or design.
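
The separation test can be expressed as a comparison of the underlying models rather than the tool names. The sketch below is a hypothetical illustration: the tool names, the model families and the mapping between them are invented, and a company would need to establish the real mapping from its vendors.

```python
from typing import Dict

# Hypothetical mapping from tool names to the base model each is built on.
BASE_MODEL: Dict[str, str] = {
    "Designer Assistant": "model-family-a",
    "Checker Copilot": "model-family-a",  # a differently branded wrapper
    "Independent Reviewer": "model-family-b",
}


def is_independent(designer_tool: str, checker_tool: str) -> bool:
    """True only if both tools are known and rest on different base models."""
    designer_base = BASE_MODEL.get(designer_tool)
    checker_base = BASE_MODEL.get(checker_tool)
    if designer_base is None or checker_base is None:
        # A tool with an unknown base model cannot be shown to be independent.
        return False
    return designer_base != checker_base


print(is_independent("Designer Assistant", "Checker Copilot"))       # False
print(is_independent("Designer Assistant", "Independent Reviewer"))  # True
print(is_independent("Designer Assistant", "Unlisted Tool"))         # False
```

Treating an unknown tool as not independent is a deliberately cautious choice: independence has to be demonstrated, not assumed.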