MarkLogic Interview Questions and Answers

Welcome to the world of MarkLogic interview questions and answers. Whether you are a seasoned professional or a curious enthusiast, this collection will help you prepare for your next MarkLogic interview. It covers fundamental concepts and advanced topics, including database architecture, data modeling, indexing, querying, and more. Let’s dive in!

Q1. What are the main features of MarkLogic?

Answer: MarkLogic has several use cases and features, but we have listed only the most important ones below.

  • Role-based security
  • ACID Transactions
  • Built-in Search
  • Indexing
  • Search and query

Q2. What is Group in MarkLogic?

Answer: Before looking at groups in MarkLogic, we need to know what a host is.

Host: A host is a computer system or server on which MarkLogic Server is installed and running.

Group: A group is a set of hosts within a single cluster that share the same configuration.

Cluster: A cluster is a set of one or more hosts that work together as a single MarkLogic system.

Q3. What is the difference between document-insert and document-load functions?

Answer: We use xdmp:document-insert to insert a document node that was created on the fly (for example, constructed in the query) into the database, while xdmp:document-load loads a document into MarkLogic directly from the file system.
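
As a rough illustration of the difference, the sketch below inserts one document built in the query and loads another from disk. The URIs and the file path /tmp/book-2.xml are hypothetical, and the file is assumed to exist on the MarkLogic host's file system.

```xquery
xquery version "1.0-ml";
(: Insert a node constructed on the fly; the URI is a made-up example :)
xdmp:document-insert("/books/book-1.xml",
  <book><title>Sample title</title></book>);

xquery version "1.0-ml";
(: Load a file from the server's file system; the path is hypothetical :)
xdmp:document-load("/tmp/book-2.xml",
  <options xmlns="xdmp:document-load">
    <uri>/books/book-2.xml</uri>
  </options>)
```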

Q4. How to extract results from MarkLogic?

Answer: How we extract results from MarkLogic depends on the requirement. When we need documents that match a particular element or value, we use the cts:search and search:search APIs.
We can also use a FLWOR expression to extract the result.

F:   for
L:   let
W:   where
O:   order by
R:   return

Or, when we need to perform a bulk operation on the data, we use the MLCP or CoRB tools, for example to import and export data.
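
The FLWOR clauses listed above can be sketched in one small query. The data here is a hypothetical in-memory fragment (the element names dvd, movieName, and year are assumptions for illustration); in practice the nodes would usually come from fn:doc, fn:collection, or cts:search.

```xquery
xquery version "1.0-ml";
(: Hypothetical sample data held in memory for the example :)
let $movies :=
  <dvds>
    <dvd><movieName>Movie A</movieName><year>1975</year></dvd>
    <dvd><movieName>Movie B</movieName><year>1986</year></dvd>
    <dvd><movieName>Movie C</movieName><year>1992</year></dvd>
  </dvds>
for $dvd in $movies/dvd                 (: for: iterate over each dvd :)
let $year := xs:integer($dvd/year)      (: let: bind a helper value :)
where $year ge 1980                     (: where: keep matching items only :)
order by $year descending               (: order by: sort the remaining items :)
return fn:string($dvd/movieName)        (: return: shape the result :)
```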

Q5. What are collections and directories in MarkLogic?

Answer: Put simply, a collection is a group of documents. Collections are used to organize documents in a database.

You cannot set properties on a collection.

Directories: You can also use directories to organize documents in a database. Directories are hierarchical.

You can set properties on a directory.
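
A minimal sketch of reading documents back through each organizing mechanism, assuming a collection named "bookstore" and a directory "/books/" exist in the database (both names are illustrative):

```xquery
xquery version "1.0-ml";
(
  (: All documents that belong to the "bookstore" collection :)
  fn:collection("bookstore"),
  (: All documents under the /books/ directory, at any depth :)
  xdmp:directory("/books/", "infinity")
)
```

Collection membership is independent of the document URI, while directory membership follows from the URI path.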

Q6. If you want to search at the document level, how can you achieve it?

Answer: We can use cts:search to search within documents.
Example:

cts:search
(
fn:doc('doc_name.xml'), 
cts:element-value-query(xs:QName('city'), 'Delhi')
)

Q8. How do you add a collection name to a document in MarkLogic?

Answer: We can use the xdmp:document-add-collections() function to add a document to a collection. It takes two parameters: the first is the URI of the document, and the second is the collection name.
Example:

xdmp:document-add-collections("books.xml", "bookstore")

Here bookstore is a collection name.   

Q9. How do you add metadata to a document in MarkLogic?

Answer: We use the xdmp:document-put-metadata($uri as xs:string, $metadata as map:map) function to add metadata to a document. This function is not available in ML-8; it was introduced in ML-9.
In ML-8, we can add properties to the document with xdmp:document-add-properties($uri as xs:string, $props as element()*).
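
A short sketch of both approaches, assuming a document already exists at the hypothetical URI /books/book-1.xml. The metadata key "status" and the property element are made up for illustration.

```xquery
xquery version "1.0-ml";
(: ML-9 and later: attach key/value metadata to an existing document :)
xdmp:document-put-metadata("/books/book-1.xml",
  map:entry("status", "reviewed"));

xquery version "1.0-ml";
(: ML-8 style: add a property element to the document's properties :)
xdmp:document-add-properties("/books/book-1.xml",
  <reviewStatus>reviewed</reviewStatus>)
```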

Q10. How do you apply cts:search to JSON documents?

Answer: We can apply cts:search to JSON documents with the help of the cts:json-property-* queries.

cts:search(/, cts:json-property-value-query("city", "Delhi"))

JSON document:

{
  "detail": {
    "name": "sonoo",
    "salary": "56000",
    "city": "Delhi"
  }
}

Q11. What is the search:search API?

Answer: search:search is a powerful, high-level MarkLogic search API that uses the core cts API internally to resolve searches. It combines searching, search parsing, search grammar, faceting, snippeting, search term completion, and other search application features into a single API.

The search:search API takes its input as a query string and returns results according to the options you supply.
Example:

import module namespace search =
 "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

search:search
(
 "Amman Koil Kizhakkaalae", 
  <options xmlns="http://marklogic.com/appservices/search">
   <return-results>true</return-results>
     <return-facets>true</return-facets>
    </options>
)

Result:

<search:response
snippet-format="snippet" 
total="2" 
start="1" 
page-length="10"
xmlns:search="http://marklogic.com/appservices/search">
<search:result index="1"
 uri="bollywood_1986.xml"
 path="fn:doc(&quot;bollywood_1986.xml&quot;)"
 score="83968" 
 confidence="0.629252" 
 fitness="0.6709731">
	<search:snippet>
		<search:match 
		path="fn:doc(&quot;bollywood_1986.xml&quot;)/dvd/movieName">
		<search:highlight>Amman</search:highlight>
		<search:highlight>Koil</search:highlight>
		<search:highlight>Kizhakkaalae</search:highlight>
		</search:match>
	</search:snippet>
</search:result>
<search:qtext>Amman Koil Kizhakkaalae</search:qtext>
<search:metrics>
<search:query-resolution-time>
PT0.0177147S
</search:query-resolution-time>
<search:snippet-resolution-time>
PT0.0020601S
</search:snippet-resolution-time>
<search:total-time>
PT0.020584S
</search:total-time>
</search:metrics>
</search:response>

Q13. What are facets and constraints in MarkLogic?

Answer: Facets provide a way to categorize and aggregate data for search exploration, while constraints define rules and conditions that filter and refine search results. Together, they enhance the search capabilities of MarkLogic and let users search and analyze large volumes of data effectively.

Facets: Facets in MarkLogic are a way to categorize and organize data for search and exploration. They extract and aggregate values from the documents stored in the database. Facets can be based on various criteria, such as element values or ranges of numeric or temporal values. For example, if you have a collection of books, you can create facets on author, genre, publication year, or price. Facets help users narrow down search results by offering options to filter and refine the data.

Constraints: Constraints in MarkLogic define rules and conditions for search queries. They specify criteria that a document must satisfy to be considered a match. Constraints can be applied to properties, elements, or ranges of values. For example, you can create a constraint that returns only books published after a specific year, or only documents with a certain property value. Constraints ensure that search results meet specific criteria.

When we use the search grammar in the search:search API, we need to define the facets and constraints in the options. We can declare a constraint as below:

import module namespace search =
 "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";
let $option :=
<options xmlns="http://marklogic.com/appservices/search">
<return-results>true</return-results>
<return-facets>true</return-facets>
	<constraint name="movieName">
	<range type="xs:string" facet="true">
	<element ns="" name="movieName"/>
	</range>
	</constraint>
</options>
return search:search("movieName:Jahanara", $option)

Result:

<search:response
 snippet-format="snippet"
 total="4"
 start="1"
 page-length="10">
<search:result 
	....
</search:result>
<search:result 
	.....
</search:result>
<search:facet name="movieName" type="xs:string">
	<search:facet-value 
	name="Jahanara" 
	count="4">
	Jahanara
	</search:facet-value>
</search:facet>
<search:qtext>movieName:Jahanara</search:qtext>
<search:metrics>
<search:query-resolution-time>
PT0.0035577S
</search:query-resolution-time>
<search:facet-resolution-time>
PT0.0057432S
</search:facet-resolution-time>
<search:snippet-resolution-time>
PT0.00118S
</search:snippet-resolution-time>
<search:total-time>
PT0.0112312S
</search:total-time>
</search:metrics>
</search:response>

Here, a constraint is defined with faceting enabled in the options, and we pass it to the search API in the form constraintName:value.

Note: To define a facet, a range index must be configured for that element in the database. A constraint on its own (for example, a value or word constraint) does not require one.

Q14. What is a transaction in MarkLogic?

Answer: In MarkLogic, a transaction is a unit of work that involves one or more database operations, such as inserting, updating, or deleting documents. Transactions ensure data integrity and consistency when multiple operations need to be performed together as one atomic operation.

MarkLogic supports ACID (Atomicity, Consistency, Isolation, Durability) transactions, which guarantee that either all the operations within a transaction complete successfully or none of them are applied.
There are two types of transactions in MarkLogic:

Single-statement transactions, which are committed automatically when the statement completes.

Multi-statement transactions, which span several statements and must be committed explicitly.
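
A minimal sketch of a multi-statement transaction, assuming MarkLogic 9 or later, where the xdmp:commit and xdmp:update options are available (earlier releases used a transaction-mode setting instead). The URIs are hypothetical.

```xquery
xquery version "1.0-ml";
(: Ask for an explicitly committed (multi-statement) update transaction :)
declare option xdmp:commit "explicit";
declare option xdmp:update "true";

xdmp:document-insert("/tx/first.xml", <first/>);

(: Runs in the same transaction as the statement above :)
xdmp:document-insert("/tx/second.xml", <second/>);

(: Nothing is visible to other transactions until this commit succeeds :)
xdmp:commit()
```

Calling xdmp:rollback() instead of xdmp:commit() would discard both inserts, which is the atomicity guarantee described above.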

Q15. Which types of nodes are in XQuery?

Answer: In XQuery, there are seven kinds of nodes:

  1. element
  2. attribute
  3. text
  4. namespace
  5. processing-instruction
  6. comment
  7. document (root) nodes.

Q16. How do you search for documents based on a key that is present in more than one document?

Answer: We can search for documents by key in MarkLogic with a cts:search query. The first query below matches documents where the key has a specific value; the second matches any document that contains the key at all:

cts:search
(
fn:collection('abcd'), 
cts:element-value-query(xs:QName('key'), 'value')
) 

or

cts:search
(
fn:collection('abcd'), 
cts:element-query(xs:QName('key'), cts:true-query())
)

Q17. What are ingestion and curation in MarkLogic?

Answer: Ingestion is the process of inserting data into MarkLogic as is, in its raw format. Curation is the process of transforming that data into your system's schema format based on a mapping.

Q18. What is the difference between the (=) sign and (eq) in a query?

Answer: Both ‘=’ and ‘eq’ are comparison operators, but they are not interchangeable. The value comparison operators (eq, lt, gt) compare single values, while the general comparison operators (=, <, >) also work on sequences: the comparison is true if any item in the sequence satisfies it.

xquery version "1.0-ml";
(: "=" is a general comparison, so it accepts a sequence :)
let $seq := ("abc", "bcd", "cdb")
(: "eq" is a value comparison, so each operand must be a single value :)
let $single := "abc"
return (
   if($seq = 'abc') then "ok" else "not ok",
   if($single eq 'abc') then "ok" else "not ok"
)
Result: both comparisons return "ok". Using eq with the three-item sequence instead would raise an error.

Q20. What is the difference between fn:contains() and cts:contains() in MarkLogic?

Answer: fn:contains() checks whether one string contains another string, while cts:contains() checks whether a node matches a cts query.
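
The contrast can be sketched with an in-memory node. The address and city elements are hypothetical; fn:contains works on the string value, while cts:contains evaluates a cts query against the node.

```xquery
xquery version "1.0-ml";
(: Hypothetical in-memory node used for both checks :)
let $node := <address><city>New Delhi</city></address>
return (
  (: Plain substring test on a string value :)
  fn:contains(fn:string($node/city), "Delhi"),
  (: Query match on a node: word-based rather than substring-based :)
  cts:contains($node, cts:element-word-query(xs:QName("city"), "delhi"))
)
```

Because the second check is a word query, it follows search semantics such as tokenization and default case handling rather than raw character matching.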

Q21. What are filtered and unfiltered searches in MarkLogic?

Answer: Filtered and unfiltered are options of cts:search. A filtered search fetches candidate results from the indexes and then validates each one against the actual document. An unfiltered search fetches results from the indexes only and does not validate them against the documents, so its results may sometimes be inaccurate. In exchange, unfiltered searches are faster than filtered searches. By default, cts:search runs filtered.
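
As a small sketch, the same query can be run both ways by passing the "unfiltered" option to cts:search. The city element is borrowed from the earlier examples and is only illustrative.

```xquery
xquery version "1.0-ml";
let $query := cts:element-value-query(xs:QName("city"), "Delhi")
return (
  (: Default: filtered, index candidates are checked against the documents :)
  fn:count(cts:search(fn:doc(), $query)),
  (: Unfiltered: trusts the indexes, faster but possibly less accurate :)
  fn:count(cts:search(fn:doc(), $query, "unfiltered"))
)
```

If the two counts differ, the indexes alone cannot fully resolve the query, and index settings may need adjusting before relying on unfiltered results.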

Q22. What is the basic difference between the xdmp:spawn and xdmp:invoke functions in MarkLogic?

Answer: Before looking at xdmp:spawn-function and xdmp:invoke-function in MarkLogic, we need to understand the terms synchronous and asynchronous. It also helps to know about the main server, the task server, and transactions in MarkLogic.

Here is a simple definition of each term.

Synchronous and Asynchronous Tasks:

  • Synchronous: In synchronous operations, tasks are performed one at a time, and you cannot move on to the next task until the current one is finished.
  • Asynchronous: In asynchronous operations, you can move on to another task before the previous one finishes.

Now we need to know which functions in MarkLogic are asynchronous and which are synchronous:

Asynchronous Functions:

  • xdmp:spawn: This function places a module on the task queue for processing. The module is evaluated when the task server has the capacity to process it. Tasks are processed in the order in which they were added to the queue.
  • xdmp:spawn-function: This function places a particular function on the task queue for processing. The function runs when the task server has the resources to do so. Tasks are processed in the order in which they were added to the queue.

Once either of the above functions has been called, the spawned task cannot be rolled back, even if the transaction from which it was called does not complete.

Synchronous Functions:

  • xdmp:eval: Returns the result of evaluating a string as an XQuery module.
  • xdmp:invoke: Returns the result of evaluating an XQuery or Server-Side JavaScript module at the given path.
  • xdmp:invoke-function: The XQuery version of this function (xdmp:invoke-function) can only be used to invoke XQuery functions. The Server-Side JavaScript version (xdmp.invokeFunction) can only be used to invoke JavaScript functions.
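
A minimal sketch contrasting the two styles, assuming the calling user has the privileges these functions require. The log message is made up for illustration.

```xquery
xquery version "1.0-ml";
(: Asynchronous: the function is queued on the task server and this
   query continues without waiting for it to run :)
let $_ := xdmp:spawn-function(function() {
  xdmp:log("spawned task ran")
})
(: Synchronous: the caller waits here until the function returns :)
let $count := xdmp:invoke-function(function() {
  xdmp:estimate(fn:doc())
})
return $count
```

The invoked function's result is available to the caller immediately, whereas the spawned function's work only shows up later, here as a line in the server log.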

Q23. How do you measure XQuery query performance in MarkLogic?

Answer: First, we can analyze a query with the profiler that is available in Query Console. It shows the execution time line by line, and based on those timings you can redesign or tune the query.

Second, we use the xdmp:query-meters and xdmp:query-trace functions to understand and tune the performance of queries.
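
A small sketch of the second approach. The search itself is only a placeholder workload; xdmp:query-trace writes its output to the server log, while xdmp:elapsed-time and xdmp:query-meters return values as part of the result.

```xquery
xquery version "1.0-ml";
(
  (: Turn on query tracing for this request; details go to the error log :)
  xdmp:query-trace(fn:true()),
  (: Placeholder workload to measure :)
  fn:count(cts:search(fn:doc(), cts:word-query("Delhi"))),
  (: Time spent so far in this request :)
  xdmp:elapsed-time(),
  (: Cache hits and misses, fragments touched, and similar counters :)
  xdmp:query-meters()
)
```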

Q24. What are E-nodes and D-nodes in MarkLogic?

Answer: Evaluator node (E-node): E-nodes evaluate XQuery programs, XCC/XDBC requests, and other server requests. An E-node communicates with D-nodes when a request needs forest data; otherwise the request is evaluated entirely on the E-node.

Example 1: Suppose you run a query that performs a mathematical calculation and needs no forest data. The query runs entirely on the E-node and never communicates with a D-node.

Example 2: Suppose you want to extract some forest data. You run the query on an E-node, and the E-node sends a request to the D-node to fetch the data. The D-node fetches the data and returns it to the E-node.

Data node (D-node):

D-nodes hold the data and are responsible for maintaining transaction integrity during insert, update, and delete operations. In other words, a D-node is responsible only for data.

Example: For bulk import or export you can use MLCP. In its fast-load mode, MLCP can send documents directly to the D-node hosts that hold the target forests.

Q25. What are the universal index and range indexes?

Answer: The universal index is the index that MarkLogic builds automatically when data is loaded into the database.

By default, the universal index includes:

  • Word indexing
  • Phrase indexing
  • Relationship indexing
  • Value indexing

Inverted index:

An inverted index inverts the document-word relationship into a word-documents relationship. 

Each entry in the inverted index is called a term list.

Range Index:

A range index is also an index in MarkLogic, but it is created by the developer based on project requirements.

We create range indexes when the project needs fast results for queries such as range comparisons, sorting, or facets.

  • Path range index
  • Element range index
  • Element attribute range index
  • etc…

In short, the universal index is created automatically at ingestion time without any configuration, while a range index must be configured in MarkLogic first.
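
As a sketch of what a configured range index makes possible, the query below assumes a string element range index already exists on the movieName element used in the earlier facet example, with a collation that matches the query's default. Without that index, both calls would fail.

```xquery
xquery version "1.0-ml";
(
  (: List the distinct values straight from the range index :)
  cts:element-values(xs:QName("movieName")),
  (: Resolve a comparison against the range index :)
  cts:search(fn:doc(),
    cts:element-range-query(xs:QName("movieName"), "=", "Jahanara"))
)
```

By contrast, a cts:element-value-query on the same element needs no extra setup, because it is resolved from the universal index.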