What LinkedIn learned leveraging LLMs for its billion users


Evaluation challenges

Another functionality goal LinkedIn had for its project was automatic evaluation. LLMs are notoriously challenging to assess in terms of accuracy, relevancy, safety, and other concerns. Leading organizations and LLM makers have been attempting to automate some of this work, but according to LinkedIn, such capabilities are “still a work in progress.”

Without automated evaluation, LinkedIn reports that “engineers are left eye-balling results and testing on a limited set of examples and having a 1+ day delay to know metrics.”

The company is building model-based evaluators to help estimate key LLM metrics, such as overall quality score, hallucination rate, coherence, and responsible AI violations. Its engineers say this will enable faster experimentation. They have had some success with hallucination detection, but that work is not yet finished.
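The article does not describe LinkedIn's implementation, but the general pattern it refers to, model-based (LLM-as-judge) evaluation, can be sketched roughly as follows. Here a judge callable scores each prompt/response pair and the scores are aggregated into batch metrics; the `toy_judge` below is a hypothetical rule-based stand-in for what would in practice be a call to a judge LLM, and all names are illustrative assumptions, not LinkedIn's API.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A judge maps (prompt, response) to a verdict dict; in a real system this
# would call a judge LLM, here we keep it pluggable so the sketch is runnable.
Judge = Callable[[str, str], Dict]

def run_model_based_eval(examples: List[Tuple[str, str]], judge: Judge) -> Dict:
    """Score each (prompt, response) pair with a judge and aggregate metrics."""
    verdicts = [judge(prompt, response) for prompt, response in examples]
    return {
        "mean_quality": mean(v["quality"] for v in verdicts),
        "hallucination_rate": mean(
            1.0 if v["hallucinated"] else 0.0 for v in verdicts
        ),
    }

def toy_judge(prompt: str, response: str) -> Dict:
    """Toy rule-based stand-in for an LLM judge (illustrative only)."""
    return {
        "quality": 1.0 if response else 0.0,
        "hallucinated": "unicorn" in response,
    }

examples = [("q1", "a grounded answer"), ("q2", "a unicorn told me so")]
metrics = run_model_based_eval(examples, toy_judge)
print(metrics)  # one flagged response out of two
```

The aggregation step is what turns slow per-example eyeballing into batch metrics; swapping `toy_judge` for a model call is where the "work in progress" the article mentions lives.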


