corpora.ai: Tips and Tricks Pt. 1 of...
Prompts, Functions, Insights
Learn how to truly become a corpora.ai master
This is our first article in what will become a series of helpful guides on how to leverage corpora.ai to trivialize and democratize deep, meaningful research into a variety of subjects, individuals and disciplines. We introduce our pseudo query language, the corpora.ai Prompting Language (CPL), and cover some of the functions and features that CPL now exposes to the user.
For any links to examples, you will require a corpora.ai account. If you don't have one, you can register for access and will be invited to create your account after a short review process.
Tips covered in this article:
- Controlling Source Content by group
- Controlling Source Content by date
- Comparison Queries
- Controlling Authored Language
- Controlling Output Persona
Our first tip/trick is Controlling Source Content by group:
Understanding the value of this tip takes a bit of context and a light explanation of the core engine and data that corpora.ai has. corpora.ai has a core engine which ingests content in real time, and it currently understands and has local access to over 1 petabyte of compressed content. This patented and proprietary engine understands the contextual grouping of content, author and publisher, allowing corpora.ai to maintain a current and past view of 'the world' based on the taxonomical grouping of authors and publishers. Every publisher and author can possess multiple classifications, e.g. a political publisher can also be a news publisher.
With CPL, users can define the source content for their research in a variety of ways. The two key methods are a function call and natural language, and there are a handful of alias functions and keywords to use with these methods. The list below details the alias functions.
Source Content Functions
- using()
- focus()
- from()
- with()
- content()
- source()
The keywords are the function names without the parentheses, and they can read far more naturally. Either form can be used at any logical, natural position within the query, e.g. "using(tech, news)..." or "How has EV technology evolved from legal". The first example filters the source content to only publishers and authors that have any of the classifications in the tech or news groups, and the second filters it to those in the legal group.
The source content groups are shown in a list below:
- tech
- legal
- medical
- finance
- financial
- news
The list above is current at the time of writing and will be updated over time. We will write blog posts about those updates when they are released.
The idiom "A picture paints a thousand words" is always true in my experience, so below is a collection of links to example overviews using a mix of content source filter functions and keywords supported by CPL.
- "What has been discovered about the deep amazon from(news)"
- "Using(medical) what correlation is there between hydrocephalus and cutis aplasia"
- Innovations in finance and trading focus(financial)
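For readers who assemble queries in code, the function form described above can be sketched as a small string builder. This is a hypothetical helper and not part of corpora.ai: the names `SOURCE_FUNCTIONS`, `SOURCE_GROUPS` and `with_source` are assumptions made for illustration, and the two lists simply mirror the ones in this article at the time of writing.

```python
# Hypothetical helper: appends a CPL source content function to a query string.
# The alias and group lists mirror this article and may change over time.
SOURCE_FUNCTIONS = {"using", "focus", "from", "with", "content", "source"}
SOURCE_GROUPS = {"tech", "legal", "medical", "finance", "financial", "news"}


def with_source(query: str, groups: list, function: str = "using") -> str:
    """Return the query followed by a call such as 'from(news)'."""
    if function not in SOURCE_FUNCTIONS:
        raise ValueError(f"unknown source function: {function!r}")
    unknown = [g for g in groups if g not in SOURCE_GROUPS]
    if not groups or unknown:
        raise ValueError(f"missing or unknown source groups: {unknown}")
    return f"{query.strip()} {function}({', '.join(groups)})"


print(with_source("What has been discovered about the deep amazon", ["news"], "from"))
```

Validating against the known lists catches a misspelled group before the query is submitted. Placing the call at the end of the query is only one choice, since the function can sit at any logical position.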
The second tip/trick in this article is Controlling Source Content by Date
Users can control the timeframe of the source content that their research is built upon through various natural language mechanisms. This is very powerful as it allows the user to compare the known information of an entity or hypothesis with the same view a year prior.
Users can use the following keywords for date filtering - at the time of writing, there are no date filter functions:
- before
- after
- during
- between
- in
- until
- upto
- prior
- range
The above keywords can be used in any logical and natural position within a corpora.ai query, e.g. "What was the political landscape like before 1939" or "Clinical trials targeting peritoneal mesothelioma between 2010 and 2019".
Below are some links to queries utilizing the Date Filtering functionality of CPL:
- What role did the church play in legislation in the UK before 1900
- What impact did the suez canal have on the global economy during(2021)
- How did the japanese economy perform range(1960, 1990)
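The keyword form of date filtering can be sketched in the same way. The helper below is hypothetical (`DATE_KEYWORDS` and `with_date` are names invented for this example) and it assumes that every keyword other than "between" is written as the keyword followed by a single year, a form the article only shows for "before".

```python
from typing import Optional

# Date keywords as listed in this article at the time of writing.
DATE_KEYWORDS = {"before", "after", "during", "between", "in",
                 "until", "upto", "prior", "range"}


def with_date(query: str, keyword: str, start: int,
              end: Optional[int] = None) -> str:
    """Append a natural-language date filter to a query (hypothetical helper)."""
    if keyword not in DATE_KEYWORDS:
        raise ValueError(f"unknown date keyword: {keyword!r}")
    if keyword == "between":
        if end is None or end < start:
            raise ValueError("'between' needs an end year not before the start year")
        return f"{query.strip()} between {start} and {end}"
    if end is not None:
        raise ValueError(f"{keyword!r} takes a single year in this sketch")
    # Assumption: the remaining keywords are written as '<keyword> <year>'.
    return f"{query.strip()} {keyword} {start}"


print(with_date("What was the political landscape like", "before", 1939))
print(with_date("Clinical trials targeting peritoneal mesothelioma", "between", 2010, 2019))
```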
The third tip/trick in this article is Comparison Queries
Comparison Queries expose the discovery aspect of corpora.ai perfectly. Given a topic, corpora.ai identifies competitors that meet any other provided criteria and then constructs a research report on each entity. Every report focuses on the same shared metrics, which aids user comprehension because the comparison is consistent. The only CPL function not supported concurrently with Comparison Queries is Content Source Filtering.
This is a contextual comparison, so users can use the following keyword phrases to initiate comparison queries and build multiple research books:
- find competitors of x
- ...described as x
- products for x
- ...that are described as x
- ...that compete with x
- challengers to/of x
- alternatives to/of x
- replacements for x
- rivals to x
- opposite to x
- antagonist to x
- slanderer of x
- allies of x
- similar to x
- associates with x
- supports x
The reason for such variance in the keywords is that it affords the greatest flexibility to the user.
Below are links to example queries that utilize the Comparison Query processing functionality of corpora.ai:
- List of pain killers described as safe for ckd patients
- Find countries described as tax havens
- Find competitors of vietnam described as having similar economies
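Because Comparison Queries cannot be combined with Content Source Filtering, a query can be checked for that combination before it is submitted. The sketch below is a hypothetical client-side check, not a corpora.ai feature. It only recognises the function form of the source filter (such as `from(news)`), it matches just a subset of the comparison phrases listed above, and all names in it are assumptions.

```python
import re

# Source content functions from this article; keyword forms are not detected.
SOURCE_FUNCTIONS = ("using", "focus", "from", "with", "content", "source")

# A subset of the comparison phrases listed in this article.
COMPARISON_PHRASES = ("competitors of", "described as", "compete with",
                      "alternatives to", "rivals to", "similar to")

# Matches a source function name followed by an opening parenthesis.
_SOURCE_CALL = re.compile(r"\b(?:%s)\s*\(" % "|".join(SOURCE_FUNCTIONS),
                          re.IGNORECASE)


def mixes_comparison_and_source_filter(query: str) -> bool:
    """True if the query looks like a comparison AND calls a source function."""
    lowered = query.lower()
    is_comparison = any(phrase in lowered for phrase in COMPARISON_PHRASES)
    return is_comparison and _SOURCE_CALL.search(query) is not None


print(mixes_comparison_and_source_filter("Find countries described as tax havens"))
print(mixes_comparison_and_source_filter("Find countries described as tax havens from(finance)"))
```

The first call prints `False` and the second prints `True`, since only the second query pairs a comparison phrase with a source function call.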
The next tip/trick to cover in this article is Controlling Source Content by Language
Users can also control the language of the source content their research is built upon through CPL functions, while also controlling the output language. For example, a user can structure their query as: written(x, y) in z, where x and y are source languages and z is the output language.
This is very powerful as it gives the researcher the ability to view differing regional content and understanding of various topics. Below is a list of the functions that can be used to achieve this.
- output
- in
- use
- report
- language
- authored
- written
- composed
- published
To demonstrate the above, the examples below show a mix of these functions.
- How has the chinese economy grown written (zh, ja) output english
- How has global warming affected russia authored(ru)
- Desertification prevention in northern africa composed(ar)
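The `written(x, y) in z` structure can also be generated by a small helper. This is a hypothetical sketch: the article does not state which of the listed names select the source language and which select the output language, so the split used here (`authored`, `written`, `composed`, `published` for source, `in` for output) is an assumption inferred from the examples, as are the names `SOURCE_LANGUAGE_FUNCTIONS` and `with_languages`.

```python
from typing import Optional, Sequence

# Assumption: these names select the source language (inferred from examples).
SOURCE_LANGUAGE_FUNCTIONS = {"authored", "written", "composed", "published"}


def with_languages(query: str, source_languages: Sequence[str],
                   output_language: Optional[str] = None,
                   function: str = "written") -> str:
    """Append 'written(x, y) in z' style language controls to a query."""
    if function not in SOURCE_LANGUAGE_FUNCTIONS:
        raise ValueError(f"unknown source language function: {function!r}")
    if not source_languages:
        raise ValueError("at least one source language is required")
    clause = f"{function}({', '.join(source_languages)})"
    if output_language:
        # 'in z' follows the structure given in the article.
        clause += f" in {output_language}"
    return f"{query.strip()} {clause}"


print(with_languages("How has the chinese economy grown", ["zh", "ja"], "english"))
print(with_languages("How has global warming affected russia", ["ru"], function="authored"))
```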
The fifth and final tip/trick to cover in this article is Controlling Output Persona
The output persona controls the style of language used in the generated research book. Persona is a function and has alias functions too, which are detailed below:
- persona
- audience
- reader
- recipient
- listener
These functions take one parameter: a succinct natural-language description of the target audience, e.g. audience(high school), reader(phd student), persona(professional). Each of those examples controls the output construction process used to author the research books and ensures the content is appropriate for the provided persona.
The links below show this in practice.
- 737-max avionics and acas persona(experienced,commercial,pilot)
- Segmentation fault prevention audience(engineer, mathematician)
- Understand the comparison between dry belt and wet belt engines reader(mechanic, engineer)
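A persona call can be appended to a query in the same spirit. The helper below is hypothetical (`PERSONA_FUNCTIONS` and `with_persona` are names invented for this example). It passes the audience description through unchanged, since the parameter is free-form natural language.

```python
# Persona function aliases as listed in this article at the time of writing.
PERSONA_FUNCTIONS = {"persona", "audience", "reader", "recipient", "listener"}


def with_persona(query: str, description: str, function: str = "persona") -> str:
    """Append a persona call such as 'audience(high school)' to a query."""
    if function not in PERSONA_FUNCTIONS:
        raise ValueError(f"unknown persona function: {function!r}")
    description = description.strip()
    if not description:
        raise ValueError("the persona description must not be empty")
    return f"{query.strip()} {function}({description})"


print(with_persona("Segmentation fault prevention", "engineer, mathematician", "audience"))
```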
We hope these Tips and Tricks are helpful and unlock your research potential with corpora.ai. Our intention is to continue to grow and evolve the platform so that CPL supports more configuration of user queries.
Feel free to let us know if this article and type of content is helpful. We welcome your feedback via email at support@corpora.ai, X @corpora_ai, LinkedIn or any other of our social platforms, as well as in the comments below.