Mastodon - 2024-02-27T01:22:33Z

Mastodon

Been using LangChain and Chat GPT to analyze a corpus of text (https://info.arxiv.org/help/bulk_data.html). I'm taking PDFs, extracting the plaintext per-page and querying for something like "No more than N sentences that reference concept X".

It *seems* to do ok with general sentiment and fails to extract sentences that are related to the concept. It happily returns sentences that a human reader would consider unrelated.

Mastodon Source 🐘

To "debug", I try the interactive chat version to see if I can understand why certain sentences are returned.

It initially respected limiting responses to the input text. It acknowledged its error (?) and understandably couldn't explain why a specific sentence was included.

Mastodon Source 🐘

The refined sentences were not accurate, so I followed up with a request to limit responses to the input text only.

It then started to include sentences that did not exist in the text at all.

Mastodon Source 🐘

Giving up now - maybe there's a "magic prompt" I should be using, but I have no clue how to find it.