Dual
Learn more by reading the official write-up.
Installation (currently only available from source)
Download Dual.zip and unzip it in .obsidian/plugins/. Follow the instructions in the plugin settings tab to continue. Arm yourself with patience!
.. Dual::
.
|-- skeleton
| |-- conversational_wrapper.py
| |-- core.py
| |-- requirements.txt
| |-- server.py
| |-- util.py
|-- essence
| |-- config.json
| |-- pytorch_model.bin
| |-- training_args.bin
|-- main.js
|-- manifest.json
|-- ...
*.bin
Command Samples
Fluid Search
- Find notes about topic.
- Search for entries on topic.
- Look up texts related to topic.
Descriptive Search
- Find an entry which description.
- Search for a note that description.
- Look for a text which description.
Open Dialogue
- question?
torch.embedding IndexError: index out of range in self
This sometimes happens on Windows in open dialogue, and only with specific questions. My working hypothesis is that problematic characters in retrieved notes disrupt the generation process.
issue with executing server
python 3.9 MacOS Catalina 10.15.7
Not sure if this is related, but I couldn't find an essence.zip folder, but I looked in the sample vault on the repo and made a copy of the config.json to the same directory on my other machine.
Server fails to start
Console output:
Implement argument parsing in new frontend
Based on arguments detected in #52, such as *person* or *topic*, the values have to be extracted from the user query using text generation as described here. Argument names and the query should go into a function, and a dictionary with the proper value attributions should come out.
UnicodeDecodeError in utils.py when running server.py
(Python 3.8.4 on Windows 10 20H2 (64 bit))
When I ran "python server.py --path /path/to/vault" inside my vault directory, I received the following output:
A quick Google search got me to this StackOverflow question: https://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character.
One person suggested specifying the encoding when opening the file. So, inside skeleton/util.py, I changed content = open(file).read() (line 8) to content = open(file, encoding="utf8").read() and this solved the problem. But I thought it was worth mentioning anyway as I didn't see anything in the documentation about file encoding.
Add Action and move source files
To make it easier to understand for contributors, and generally tidy up a little, these changes move source files into common locations used for developing JS apps/tools.
This is in preparation for a more formal release system.
Error deriving the essence
Google keeps giving me this error:
Proposal for functionality changes and recipe framework design
Progress towards solving existing issues and setting up a proper roadmap has been slowed in the past few days by the fear of prematurely settling on an architecture and API design, given that this space of conversational interfaces over personal knowledge bases is quite unexplored.
The following describes a suggestion for heavily restructuring the functionality and the codebase, a tentative something in between a spec and a user story.
Architecture
Dual is based on two components: the backend and the frontend. The backend is a server which exposes two main endpoints:
- /extract, which returns entries from one's knowledge base based on a natural language description, with some options
- /generate, which generates text given a prompt, with some options
However, the user doesn't usually interact with the endpoints directly. Rather, they use recipes. Recipes tell Dual how to answer certain commands. They can be predefined, user defined, or contributed by some other user. Recipes are simple Markdown files with the following structure:
If the user has this recipe in their vault as a note, then whenever they ask their Dual that question, they'll get the contents of the note as an answer.
The pattern field of a recipe is a regex pattern. It can also house groups, which can then be referenced in the content.
With this recipe, if the user tells their Dual "My name is John", it'll reply with "Hi there, John!".
All this is cute, but not all that useful or interesting. Among the recipes there's also this predefined recipe:
Now, this is good old descriptive search, expressed as a recipe which makes use of the /extract endpoint. When asking "Find a note which describes a metaphor between machine learning and sociology", it'll answer with a list of results based on that GET HTTP call made behind the scenes to the endpoint.
But if you wanted to customize the command triggers even for this predefined command, you could just wrap a new recipe around it, or change the original one. Here's a wrapper recipe:
Cool, you just made your Dual a bit edgier.
So this is how you can express good old descriptive search and fluid search as recipes. What about good old open dialogue?
Now, when you ask it a question with that structure, Dual assembles the relevant notes in there, composes the prompt further with your query, and then generates the response. Good old open dialogue, but expressed as a recipe. Every command becomes a customizable recipe.
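The prompt assembly step described above could be sketched roughly as follows. This is only an illustration under assumptions: `build_prompt`, the `max_chars` budget, and the `Q:`/`A:` layout are hypothetical and not taken from Dual's code.

```python
def build_prompt(notes, query, max_chars=2000):
    """Join retrieved note excerpts and the user's query into one prompt.

    Hypothetical helper: the character budget and the Q:/A: layout are
    assumptions made for illustration only.
    """
    parts = []
    used = 0
    for note in notes:
        excerpt = note.strip()
        # Stop adding notes once the (assumed) character budget is exhausted.
        if used + len(excerpt) > max_chars:
            break
        parts.append(excerpt)
        used += len(excerpt)
    context = "\n\n".join(parts)
    # The query goes last so the generated text continues as the answer.
    return f"{context}\n\nQ: {query}\nA:"


prompt = build_prompt(
    ["Note about metaphors.", "Note about sociology."],
    "How do these ideas relate?",
)
print(prompt)
```

The resulting string is what would be sent to the /generate endpoint in this sketch.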
Now, if you want to teach your Dual to come up with writing prompts, you create this recipe:
You ask it "Come up with a writing prompt" and you get some in return.
Sure, there are technicalities. The note contents up to the generate call should be piped into it as the prompt. The endpoints are shorthand for localhost:5000/..., but you could perhaps change them to refer to a hosted instance at some point in the future. You could make calls to other people's instances through recipes. You could tap into any API through a recipe, turning Dual into a sort of conversational hub. Regex groups have to be entered when making calls. URLs have to be encoded properly because they contain text. Extract calls should know whether to supply filenames or contents, through parameters probably. What should a recipe return, the entire contents or the result of the last call? Perhaps a metadata setting. A bunch of things still to settle on.
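Two of these technicalities, substituting regex groups and encoding URLs, can be sketched with the Python standard library. This is a tentative illustration, not the proposed implementation: `fill_groups`, `endpoint_url`, the `\1` template syntax, and the `query` parameter name are all assumptions.

```python
import re
from urllib.parse import urlencode

# Assumption: the endpoints are shorthand for a local server on port 5000.
BASE_URL = "http://localhost:5000"


def fill_groups(pattern, template, command):
    """Match a command against a recipe pattern and fill groups into a template.

    Returns None when the command does not match the pattern.
    """
    match = re.fullmatch(pattern, command)
    if match is None:
        return None
    # expand() substitutes backreferences such as \1 with the captured groups.
    return match.expand(template)


def endpoint_url(endpoint, **params):
    """Build an endpoint URL with properly encoded query parameters."""
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


reply = fill_groups(r"My name is (\w+)", r"Hi there, \1!", "My name is John")
print(reply)

description = fill_groups(
    r"Find a note which (.*)",
    r"\1",
    "Find a note which describes a metaphor",
)
# "query" is a hypothetical parameter name for the /extract endpoint.
print(endpoint_url("extract", query=description))
```

Encoding through `urlencode` keeps spaces and characters such as `&` in the user's text from breaking the request.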
CrossEncoder.py IndexError: list index out of range
I get this error:
While debugging, I found this:
The Alignment in Colab works fine and I have my essence folder in the Dual path.
Do you know what fails here? Maybe my structure? I have 20k md files.
I love the use case that Dual brings to Obsidian!
Switch models to GPT-Neo versions
Similar models but with higher performance as they've been trained on more data. Hopefully they're still fine-tunable in a Colab notebook, at least the medium one.
Bundle skeleton in a self-contained binary
Not sure what's the best way to go.
Future of this project?
Hi @paulbricman,
Great project - this is exactly what I was looking for in Obsidian 👏 thanks for all your work so far!
Was curious whether you plan on further developing Dual? E.g., creating an official Obsidian plugin so that other Obsidian developers can help to extend, maintain and further improve Dual.
Cheers and hope you're well!
No responses generated by Dual other than "typing". No error in server, even in debug mode.
Dual looks fantastic! I'd love to try it out. So far, I have been unable to get it to work. I followed the instructions in the readme and made it all the way to starting the server. However, I'm unable to get any response from my prompts. I don't see any error messages, so I'm not sure what to do.
It looks like there is an error related to a missing header for CORS. I'm not sure what that means exactly, but I hope it's at least somewhat helpful.
Maybe this page is helpful?
Any advice?
RuntimeError: CUDA out of memory
Hey @paulbricman, super cool project! I ran it per instructions, but had issues with the training model:
Text:
I set the settings to medium model (8+gb RAM) and my vault size is listed as "ideal".
Any thoughts on what I can tweak?
Decoding error
While running the skeleton with the command
python3 server.py --path "/Users/ophan/Documents/KTCB/KTCB Research/"
I get this error:
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 383: character maps to <undefined>
I strongly suspect it's due to the fact I have Spanish characters in my notes. I really don't want to remove them, nor do I particularly want to go back through the previous steps. Is there a way I can get it to decode my Spanish characters? Characters used: ¿¡áéíóúñÁÉÍÓÚ
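The 'charmap' codec in the traceback suggests the notes were opened with a platform default encoding rather than UTF-8. A minimal sketch of reading a note with an explicit encoding is shown below. `read_note` is a hypothetical helper, not a function from the skeleton, and it assumes the notes are saved as UTF-8.

```python
from pathlib import Path


def read_note(path):
    """Read a note as UTF-8 regardless of the platform's default encoding.

    Hypothetical helper for illustration. errors="replace" swaps any
    undecodable byte for U+FFFD instead of raising UnicodeDecodeError.
    """
    return Path(path).read_text(encoding="utf-8", errors="replace")
```

With an explicit encoding, characters such as ¿¡áéíóúñ decode the same way on every operating system, so the notes themselves would not need to change.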
ModuleNotFoundError: no module named flask
Hey there, when I run the command 'python3 server.py --path [path to skeleton]', it comes up with the error:
Traceback (most recent call last):
  File "C:\Users\Owner\iCloudDrive\iCloud~md~obsidian\Orangeo.obsidian\plugins\Dual\skeleton\server.py", line 1, in <module>
    from flask import Flask
ModuleNotFoundError: No module named 'flask'
Anyone know how to fix this?
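This error usually means the package is not installed for the interpreter that runs server.py. The skeleton folder ships a requirements.txt, so installing from it with that same interpreter (for example `python3 -m pip install -r requirements.txt`) is the likely remedy, assuming flask is listed there. A small, hypothetical check of which modules are importable might look like this:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Only flask is known from the traceback; extend the list as needed.
missing = missing_modules(["flask"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

Running the check with the same `python3` used to start the server shows whether the install went to a different Python environment.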
Unable to "Start Alignment"
Hi! I just saw this Obsidian plug-in in a Twitter thread and wanted to give it a try. I correctly downloaded/extracted and copied the plugin in .obsidian/my_vault_name/plugin, and I can see the Dual settings in my Obsidian options panel. I downloaded/installed Python as per the instructions. I created and copied the snapshot of my vault (a not-so-big test vault). But when I try to "Start Alignment" I can't open the webpage linked to the button. Google says I don't have the permissions...
Running macOS 11.5.1 on a MacBook Air 2020 (not M1). I tried with Brave browser, Google Chrome and Safari, all logged in with my Gmail account: no way. I even tried with a Chrome private session: same result.
What am I doing wrong? Thanks!!!