MongoDB¶
The seminar's goal is to understand the concepts of the MongoDB document database and the usage of the MongoDB C#/.NET Driver.
Pre-requisites¶
Required tools to complete the tasks:
- Microsoft Visual Studio 2022
- MongoDB Community Edition
- VSCode
- MongoDB for VSCode extension
- Database initialization script: mongo.js
- Starter code: https://github.com/bmeviauac01/gyakorlat-mongo-kiindulo
Recommended to review:
- C# language and Linq queries
- MongoDB lecture
- Using MongoDB guide
How to work during the seminar¶
The exercises are solved together with the instructor. A few exercises we can try to solve by ourselves and then discuss the results. The final exercise is individual work if time permits.
This guide summarizes and explains the behavior. Before looking at these provided answers, we should think first!
Exercise 0: Create database, open starter code¶
-
Open a PowerShell console (search for PowerShell in the Start menu and start it, but not the one with "ISE" in the same - that is not the console).
-
Copy and paste the script into the console and run it by pressing enter. Please note, that you might need to change the the directory name in the last command, e.g., if the app version is different.
Remove-Item c:\work\mongodatabase -Recurse -ErrorAction Ignore New-Item -Type Directory c:\work\mongodatabase c:\tools\mongodb\bin\mongod.exe --dbpath c:\work\mongodatabase
Keep this window open because the server is running here. You can stop it by pressing Ctrl+C at the end of the class.
-
Launch VSCode and connect to the MongoDB server with the extension.
-
Let us create a new database by right-clicking the connection (localhost). This opens a playground script window, where we paste our database creation script from innen, and run it with the black “play” button found in the header. The name of the database should be
datadriven
. As a result, the collections should be created - open the elements of the database to check this. -
Download the starter solution!
- Open a new command prompt or PowerShell console (do not use the one the server is running in)
- Navigate to a folder, e.g.
c:\work\NEPTUN
-
Execute the following command:
git clone https://github.com/bmeviauac01/gyakorlat-mongo-kiindulo.git
-
Open the sln file from the newly created folder using Visual Studio. Let us examine this project.
- This is a .NET console application. The structure resembles the structure of the Entity Framework project seen before: directory
Entities
contains the database entities while our code will be written intoProgram.cs
. Program.cs
already contains the initialization of the connection to MongoDB.- Interface
IMongoClient
is used for all communication with the database. We will not use this directly. - Interface
IMongoDatabase
represents the databasedatadriven
within the MongoDB server. - And the
IMongoCollection<TEntity>
interfaces represent the specific collections we can use to execute queries and modification commands.
- Interface
- The database documents are mapped to the C# entity classes in folder
Entities
. A major difference compared to the behavior previously seen in Entity Framework is that these classes were not generated by written manually.- Most entities are already mapped.
- We will create one more class during an exercise.
- This is a .NET console application. The structure resembles the structure of the Entity Framework project seen before: directory
Exercise 1: Queries¶
Write C# code using the MongoDB C#/.NET Driver in the following exercises. Print the results to the console.
-
List the names and the amount of stock of all products that we have more than 30 in stock!
-
List the orders that consist of at least two items!
-
List the orders that have a total value of at least 30.000! For each order, print the customer name, and list all items of the order (with the product name, amount, and price).
-
Find the most expensive product!
-
List the products that have been ordered at least twice!
Solution
-
We need only the product collection and execute a simple query. The filter criteria can be written as a Lambda-expression and with the builder syntax too.
Console.WriteLine("***** Exercise one *****"); // 1.1 first solution Console.WriteLine("\t1.1 First solution:"); var qProductAndStock1 = productsCollection .Find(p => p.Stock > 30) .ToList(); foreach (var p in qProductAndStock1) Console.WriteLine($"\t\tName={p.Name}\tStock={p.Stock}"); // 1.1 second solution Console.WriteLine("\t1.1 Second solution:"); var qProductAndStock2 = productsCollection .Find(Builders<Product>.Filter.Gt(p => p.Stock, 30)) .ToList(); foreach (var p in qProductAndStock2) Console.WriteLine($"\t\tName={p.Name}\tStock={p.Stock}");
-
This is similar to the previous one. We may note that we would have needed a join in a relational database, but we have everything at hand here.
// 1.2 first solution Console.WriteLine("\t1.2 First solution:"); var qOrderItems1 = ordersCollection .Find(o => o.OrderItems.Length >= 2) .ToList(); foreach (var o in qOrderItems1) Console.WriteLine($"\t\tCustomerID={o.CustomerID}\tOrderID={o.ID}\tItems={o.OrderItems.Length}"); // 1.2 second solution Console.WriteLine("\t1.2 Second solution:"); var qOrderItems2 = ordersCollection .Find(Builders<Order>.Filter.SizeGte(o => o.OrderItems, 2)) .ToList(); foreach (var o in qOrderItems2) Console.WriteLine($"\t\tCustomerID={o.CustomerID}\tOrderID={o.ID}\tItems={o.OrderItems.Length}");
-
A simple query is not sufficient for this exercise; thus, we need the aggregation pipeline. We may still note that every information we need is still available in one collection.
// 1.3 Console.WriteLine("\t1.3:"); var qOrderTotal = ordersCollection .Aggregate() .Project(order => new { CustomerID = order.CustomerID, OrderItems = order.OrderItems, Total = order.OrderItems.Sum(oi => oi.Amount * oi.Price) }) .Match(order => order.Total > 30000) .ToList(); foreach (var o in qOrderTotal) { Console.WriteLine($"\t\tCustomerID={o.CustomerID}"); foreach (var oi in o.OrderItems) Console.WriteLine($"\t\t\tProductID={oi.ProductID}\tPrice={oi.Price}\tAmount={oi.Amount}"); }
-
To find the most expensive product, we need two queries: first, find the largest price value, then find the products with this price.
// 1.4 Console.WriteLine("\t1.4:"); var maxPrice = productsCollection .Find(_ => true) .SortByDescending(p => p.Price) .Limit(1) .Project(p => p.Price) .Single(); var qProductMax = productsCollection .Find(p => p.Price == maxPrice) .ToList(); foreach (var t in qProductMax) Console.WriteLine($"\t\tName={t.Name}\tPrice={t.Price}");
-
This exercise is complicated with our current database scheme because we do not have everything at hand within one collection. We need the product information from one collection, and the order details from another one.
We will be doing a "join" in the client-side, that is, in C# code. The solution's outline is to query the orders, then in C# gather the orders by product, and finally, query the product details.
// 1.5 Console.WriteLine("\t1.5:"); var qOrders = ordersCollection .Find(_ => true) .ToList(); var productOrders = qOrders .SelectMany(o => o.OrderItems) // All order items into one list .GroupBy(oi => oi.ProductID) .Where(p => p.Count() >= 2); var qProducts = productsCollection .Find(_ => true) .ToList(); var productLookup = qProducts.ToDictionary(p => p.ID); foreach (var p in productOrders) { var product = productLookup.GetValueOrDefault(p.Key); Console.WriteLine($"\t\tName={product?.Name}\tStock={product?.Stock}\tOrders={p.Count()}"); }
This solution is very elegant and works only for small databases. Suppose we face a similar task under real-life circumstances. We have two choices: denormalize the database scheme and copy product details into the orders, or create an aggregation pipeline executed by the server that does something similar to the code above (MongoDB can do that, but it will not be very fast).
Exercise 2: Create a new entity class¶
-
Examine the classes
Product
andVAT
. Why is there a field with a[BsonId]
attribute in classProduct
and not in classVAT
? -
Create a new entity class for mapping
Category
document, then add and initialize aIMongoCollection<Category>
field next to the others.
Solution
-
Class
Product
represents theproducts
collection; therefore each item has a uniqueObjectID
. On the other hand, classVAT
is an embedded field used byProduct
and has no collection on its own; hence it needs no id. -
Create our new
Category
POCO class.Let us check a few sample documents using VSCode in
categories
collection.Create a new class
Category
in folderEntities
with matching fields as below.using MongoDB.Bson; using MongoDB.Bson.Serialization.Attributes; namespace BME.DataDriven.Mongo.Entitites { public class Category { [BsonId] public ObjectId ID { get; set; } public string Name { get; set; } public ObjectId? ParentCategoryID { get; set; } } }
Add a new collection interface field in
Program.cs
as follows.private static IMongoCollection<Category> categoriesCollection;
And assign the value in the
initialize
method to get the collection.categoriesCollection = database.GetCollection<Category>("categories");
Exercise 3: Data modification¶
The collection classes IMongoColection<TEntity>
can also be used to execute modification operations.
-
Write C# code that increases the price of all products in category "LEGO" by 10 percent!
-
Create a new category named Expensive toys and move all products here that cost more than 8000!
-
Delete all categories that contain no products.
Solution
-
Find the ID of the category then update all products that have this category id.
Console.WriteLine("***** Exercise three *****"); //3.1 Console.WriteLine("\t3.1:"); var categoryLegoId = categoriesCollection .Find(c => c.Name == "LEGO") .Project(c => c.ID) .Single(); var qProductLego = productsCollection .Find(p => p.CategoryID == categoryLegoId) .ToList(); Console.WriteLine("\t\tBefore modification:"); foreach (var p in qProductLego) Console.WriteLine($"\t\t\tName={p.Name}\tStock={p.Stock}\tÁr={p.Price}"); productsCollection.UpdateMany( filter: p => p.CategoryID == categoryLegoId, update: Builders<Product>.Update.Mul(p => p.Price, 1.1)); qProductLego = productsCollection .Find(p => p.CategoryID == categoryLegoId) .ToList(); Console.WriteLine("\t\tAfter modification:"); foreach (var p in qProductLego) Console.WriteLine($"\t\t\tName={p.Name}\tStock={p.Stock}\tÁr={p.Price}");
-
MongoDB can execute the following sequence of steps in a single atomic step: "Get me category
Expensive toys
. If it does not exist, create it." We will useFindOneAndUpdate
to achieve this.//3.2 Console.WriteLine("\t3.2:"); var catExpensiveToys = categoriesCollection.FindOneAndUpdate<Category>( filter: c => c.Name == "Expensive toys", update: Builders<Category>.Update.SetOnInsert(c => c.Name, "Expensive toys"), options: new FindOneAndUpdateOptions<Category, Category> { IsUpsert = true, ReturnDocument = ReturnDocument.After }); productsCollection.UpdateMany( filter: p => p.Price > 8000, update: Builders<Product>.Update.Set(p => p.CategoryID, catExpensiveToys.ID)); var qProdExpensive = productsCollection .Find(p => p.CategoryID == catExpensiveToys.ID) .ToList(); foreach (var p in qProdExpensive) Console.WriteLine($"\t\tName={p.Name}\tPrice={p.Price}");
-
Query categories that contain any product, then delete the ones that do not belong among this list.
//3.3 Console.WriteLine("\t3.3:"); Console.WriteLine($"\t\tBefore modification: {categoriesCollection.CountDocuments(_ => true)} categories"); var qProductCategory = new HashSet<ObjectId>( productsCollection .Find(_ => true) .Project(p => p.CategoryID) .ToList()); categoriesCollection.DeleteMany(c => !qProductCategory.Contains(c.ID)); Console.WriteLine($"\t\tAfter modification: {categoriesCollection.CountDocuments(_ => true)} categories");
Let us note that this is not an atomic operation. If a product was added concurrently, we could have deleted its category.