Feature/docs#10
Valenzione left a comment
Overall OK, doc coverage is great.
Two major points:
- Move the API-related documentation to the OpenAPI spec; it'll make dev-doc.md clearer and more concise.
- There is a lack of user perspective right now in user-doc.md. Overall, we need to provide a clear message for each feature of hydro-vis: why we created it, and in which situations the user might want to use it. In its current state, the doc is more about "What hydro-vis is" than about "Why we did so".
Keep up the good work!
| # Why to use and what it does
|
| Visualization of embedding space of your model can bring you various insights about your data and model performance.
|
| Embeddings are low-dimensional, learned continuous vector representations of discrete variables
|
| embeddings can be used to:
|
| - find nearest points (points that your model considered close to each other)
| - detect domain drift
| - detect data where model makes mistakes
| - detect closes counterfactual - points that are close to each other but are classified by model as different
|
| Lets see what information our service provides:
|
| - Visualization of all production requests embeddings with various colorings:
|   - Colouring based on model prediction
|   - Colouring based on model confidence in predictions
|   - Colouring based on scores of your monitoring models
| - Closest requests to specific request
| - Closest counterfactuals to specific request
| - All information about request
This paragraph answers the questions "What can embeddings be used for?" and "What is hydro-vis?", but there is still no direct, concrete answer to "Why hydro-vis?"
About OpenAPI: it is already here; I mention it at the very beginning of the doc.
| - find nearest points (points that your model considered close to each other)
| - detect domain drift
| - detect data where model makes mistakes
| - detect closes counterfactual - points that are close to each other but are classified by model as different
Counterfactuals are calculated, not detected. Still, there is no clear explanation of why we might want to look at counterfactuals.
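To make "calculated, not detected" concrete, here is a minimal sketch of one way a closest counterfactual could be computed: take the nearest point in embedding space whose predicted class differs from the query's. The function name `closest_counterfactual`, the use of Euclidean distance, and the toy data are all assumptions for illustration; this is not taken from the hydro-vis code.

```python
import math


def closest_counterfactual(query_idx, embeddings, predictions):
    """Return the index of the nearest point whose predicted class differs
    from the query's predicted class, or None if there is no such point."""
    query = embeddings[query_idx]
    query_class = predictions[query_idx]
    best_idx, best_dist = None, math.inf
    for idx, (point, label) in enumerate(zip(embeddings, predictions)):
        if label == query_class:
            continue  # same class; this also skips the query itself
        dist = math.dist(query, point)  # Euclidean distance (an assumption)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


# Toy data: four 2-D embeddings with hypothetical predicted classes.
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.3, 0.2)]
predictions = ["cat", "cat", "dog", "dog"]
print(closest_counterfactual(0, embeddings, predictions))  # 3
```

One possible way to phrase the "why" for users: a counterfactual that sits very close to a request suggests the request lies near the model's decision boundary, so it may be worth inspecting.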
| Lets see what information our service provides:
|
| - Visualization of all production requests embeddings with various colorings:
|   - Colouring based on model prediction
Which problem does this coloring solve?
Yes, based on the returned class and confidence.
|   - Colouring based on model confidence in predictions
|   - Colouring based on scores of your monitoring models
Same goes for these two. It's "What" but not "Why".
|
| ## 1. Create Model and Application
|
| Create your model, which will receive some inputs and return outputs which contain field `embedding`. Embedding should be a 1 D vector. Upload your model using command `hs upload`.
Minor comment: I'd use shape notation in tuple form rather than "1 D vector".
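As an illustration of the tuple notation suggested here, the requirement could be stated as "`embedding` has shape `(N,)`". Below is a small sketch that checks this for a model output. The helper names `shape_of` and `check_embedding` are hypothetical, and plain Python lists are used so the example runs without dependencies; a real model would more likely return an array type whose shape is already available as a tuple.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a tuple, e.g. (3,) or (2, 3)."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break  # empty sequence: nothing further to descend into
        value = value[0]
    return tuple(shape)


def check_embedding(outputs):
    """Raise ValueError unless outputs['embedding'] has shape (N,) with N >= 1."""
    shape = shape_of(outputs["embedding"])
    if len(shape) != 1 or shape[0] < 1:
        raise ValueError(f"expected embedding of shape (N,), got {shape}")
    return shape


print(check_embedding({"embedding": [0.12, -0.5, 0.33]}))  # (3,)
```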
| ```
|
|
| # API
We can omit this section in this doc, since it's described thoroughly in the OpenAPI spec.
Yes, it is described there, but the OpenAPI spec does not have the additional information; here I add some comments on when to use specific requests.
|
| visualization_metrics - metrics that are used to evaluate how good will visualization reflect your real multidimensional data in 2D/3D plot. More on visualization metrics you can find [here](#visualization-metrics)
|
| possible visualization metrics:
It's better to put this into the OpenAPI spec.
|
| Returns state of a task and result if ready
|
| states: = ['PENDING', 'RECEIVED', 'STARTED', 'FAILURE', 'REVOKED', 'RETRY'] (Source: [Celery Docs](https://docs.celeryproject.org/en/latest/reference/celery.states.html#all-states))
Same here: put it into the OpenAPI spec.
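Wherever the state list ends up, a short client-side example may help readers see how the states are meant to be used. The sketch below polls until a task reaches a finished state. Note that Celery's documented set of states also includes `SUCCESS`, which the list quoted above omits. `fetch_state` is a hypothetical callable standing in for the real status request; no hydro-vis endpoint or response format is assumed.

```python
import time

# Celery's documented task states, split by whether the task can still change.
UNFINISHED = {"PENDING", "RECEIVED", "STARTED", "RETRY"}
FINISHED = {"SUCCESS", "FAILURE", "REVOKED"}


def wait_for_task(fetch_state, poll_seconds=1.0, max_polls=60):
    """Call fetch_state() repeatedly until it returns a finished state.

    fetch_state is a hypothetical zero-argument callable that returns the
    current state string for one task.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if state in FINISHED:
            return state
        if state not in UNFINISHED:
            raise ValueError(f"unexpected task state: {state!r}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"task still unfinished after {max_polls} polls")


# Simulated task that moves through three states.
simulated = iter(["PENDING", "STARTED", "SUCCESS"])
print(wait_for_task(lambda: next(simulated), poll_seconds=0))  # SUCCESS
```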
# Conflicts: # transformation_tasks/tasks.py
# Conflicts: # README.md # openapi.yaml # transformation_tasks/tasks.py
Added user and developer docs