<html><body><p>Dealing with failures and making distributed systems fault-tolerant is far too vast a topic to cover in a single blog post. We will therefore focus on a small subset of it: dealing with failures when our services communicate with each other.</p><h2 style="text-align:justify;">1. Failure is everywhere</h2><p>According to the fallacies of distributed computing, the network is never fully reliable: sooner or later, your HTTP request will either fail due to a network error or take a very long time to get a response.</p><p>Failure is not limited to the network, either. Think about your hard drive: what happens if you read a file that no longer exists because someone deleted it or, even worse, the hard drive itself has failed? What if one of your servers crashes for no apparent reason? What if your database cannot be reached? Shortages? Failure is everywhere and cannot be prevented completely.</p><h2 style="text-align:justify;">2. What is fault tolerance?</h2><p>Fault tolerance in computer systems refers to the ability of a system to continue operating without interruption when one or more of its components fail. As mentioned in the section above, failure is everywhere and inevitable. We cannot prevent it, but we can aim to detect, isolate and resolve problems preemptively.</p><p>That’s enough background on failure and fault-tolerant systems. Let’s dive into some patterns for making HTTP requests between services. (The examples below are written in C# for a .NET Core project; the syntax may differ in other languages, but the patterns remain the same.)</p><h2 style="text-align:justify;">3. Timeout pattern</h2>A single slow HTTP request doesn’t degrade your system, but a bunch of slow HTTP requests does. 
It doesn’t matter whether you are making a GET request to serve data to a client or simply exchanging information between services.<p>Let’s take a look at the example below. Service A makes a request to service B, and it takes roughly 30 seconds for service B to produce a response. That’s slow, but it still works.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-kc5rPPuzU5ZTuiNtE5KhpXE2HysTMjqXwrcXYBrm.png"><p>Figure 1: Communication between Service A and Service B</p><p>If we scale this scenario up to 30 users making the request from service A to service B, these requests hold 30 ephemeral sockets for 30 seconds while waiting for responses from service B.</p><p>In this example, Service A has process id 26900 and Service B has process id 32176. Take a look at the sockets being held by process A: this resource could be exhausted if the number of requests goes up.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-zQLHwyOKBKFm8vL6O4wQY6Lyt2VWhhTD5HeIISVV.png"><p>Figure 2: Process ids of Service A and Service B in Task Manager</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-i6wDek5UeOTk5iTriu5Msmo27eD59it81vGHfLzv.png"><p>Figure 3: Ephemeral sockets held by Service A</p><p>That’s a lot of resources, and the waiting time is not even acceptable for the user. That’s where the Timeout pattern comes in: if the request takes longer than expected, we just give up.</p><p>The good thing about the Timeout pattern is that we can detect slow-running operations and give up before they consume all the resources. The downside is that you have to configure the timeout threshold carefully, so that your system remains responsive yet doesn’t drop requests that occasionally take a bit longer than expected.</p><h3 style="text-align:justify;">Timeout pattern implementation</h3>For the sake of simplicity, we will use Polly for all our examples in this post. 
It’s a library that helps us build resilient systems without writing all the boilerplate code ourselves.<p>After creating the project, add the package Microsoft.Extensions.Http.Polly via the NuGet Package Manager, or run <code>dotnet add package Microsoft.Extensions.Http.Polly</code> if you are using the dotnet CLI.</p><p>First, we define the timeout policy in the Program.cs file (if you are using an older version of .NET Core, add it to the ConfigureServices method in Startup.cs instead).</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-01r7pQUdCuW2Ac3cjS42rhV3KfwtWEQjupU1pn2c.png"><p>Figure 4: Defining the timeout policy</p><p>When you bootstrap your HttpClient via the AddHttpClient(“”) method, add the following lines of code.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-FGmaAR2cNDnYsr5PbynN5BxGzPPEVDsBw5bNMe4C.png"><p>Figure 5: Using the timeout policy when configuring HttpClient</p><p>Let’s create an HttpClient from IHttpClientFactory and use it to see if our code works.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-erkORQd4jesFSPQqWuhs7RggC1iLxivXA9HVkvPL.png"><p>Figure 6: Creating an HttpClient from IHttpClientFactory and making a request</p><p>Our request took more than 5 seconds to respond, so a TimeoutRejectedException was thrown.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-q7loP0xTMBOgKeFiaUqe0P3jFWvF3JmFTFvUxKzo.png"><p>Figure 7: TimeoutRejectedException on request timeout</p>The good thing about Polly’s implementation is that it offers a lot of overloads to suit our needs. 
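<p>As a rough text sketch of one such overload, here is how a 5-second timeout policy with a callback might be registered. It assumes Polly v7 with the Microsoft.Extensions.Http.Polly package in an ASP.NET Core project; the client name "service-b" and the console message are illustrative and not taken from the original project.</p>
<pre><code class="language-csharp">using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using Polly;

var builder = WebApplication.CreateBuilder(args);

// Give up after 5 seconds; the callback runs every time a request times out.
var timeoutPolicy = Policy.TimeoutAsync&lt;HttpResponseMessage&gt;(
    TimeSpan.FromSeconds(5),
    onTimeoutAsync: (context, timeout, timedOutTask) =&gt;
    {
        Console.WriteLine($"Request timed out after {timeout.TotalSeconds}s.");
        return Task.CompletedTask;
    });

// "service-b" is a hypothetical named client used only for this sketch.
builder.Services
    .AddHttpClient("service-b")
    .AddPolicyHandler(timeoutPolicy);

var app = builder.Build();
app.Run();
</code></pre>
<p>With this registration, a request made through the named client that exceeds the limit is expected to surface as a TimeoutRejectedException after the callback has run.</p>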
For example, we can execute some code when the request times out by adding a callback like this:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-yd00HnUpbqk8HgBbtvNFpbzomsT3SNlISyA9vnoG.png"><p>Figure 8: Adding a callback for when a timeout happens</p><p>This code will be executed every time a request times out.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-azBearfjQl2iNsxQvjWaWDDJqdwGekd1ylYlrNVy.png"><p>Figure 9: The callback gets executed when a timeout happens</p><h2 style="text-align:justify;">Retry pattern</h2><p>Sometimes a component or a service is unavailable for just a short period of time. This kind of fault is called a “transient fault” and usually goes away quickly. It typically happens during maintenance or recovery from a crash. In this scenario, it makes sense to retry the request rather than log the error and abort.</p><p>Before you implement the retry pattern, there are a couple of points you need to consider:</p><ul><li style="text-align:justify;">Is the failure transient? If the answer is yes, you should retry, because the error is likely to disappear quickly. If the answer is no (for example, you get an Unauthorized error), the error is not going to disappear by itself, and the pattern does nothing helpful here except add another layer of complexity and overhead.</li><li style="text-align:justify;">How long should we wait between retries? For tasks like serving data to the client, the delay should be as short as possible. However, for long-running or background tasks, the delay should be long enough for the other service to come back to life.</li></ul>This pattern also has a downside when your service is actually dead or struggling to recover. 
The retries would put even more pressure on that service.<h3 style="text-align:justify;">Retry pattern implementation</h3><p>Add these lines to your Program.cs file, just like in the previous example.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-SVRIizfMV8mhayyEvC8QdVYS4pyWvSvRL7lQlevQ.png"><p>Figure 10: Adding the retry policy when configuring HttpClient</p><p>Then inject IHttpClientFactory into your class and create an HttpClient like this.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-O3bSfvUAFQLIjS9ODs3wlyofysEJmLjEXmgqiTeL.png"><p>Figure 11: HttpClient usage</p><p>From now on, whenever you use an HttpClient created by IHttpClientFactory, receiving a 5xx status code, a 408 status code or a network failure (System.Net.Http.HttpRequestException) results in up to 3 retries.</p><p>If the predefined policies of Polly don't suit your needs, refer to the Polly documentation for more custom configurations and policies.</p><h2 style="text-align:justify;">4. Circuit breaker</h2><p>Besides transient faults, there are also situations where the retries keep failing. In this case, we can’t just keep retrying forever; we should take a break and give the other service time to recover before we try again.</p><p>A circuit breaker acts as a proxy for operations that may fail (in this case, our HTTP requests). This proxy monitors the number of recent failures, then uses this information to decide whether to allow the operation to proceed or to simply return an exception.</p><p>The proxy can be viewed as a state machine that mimics the functionality of an electrical circuit breaker:</p>Closed: The proxy maintains a count of recent failures. If a request fails, it increases the counter by one. 
If the counter exceeds the threshold, the proxy changes to the Open state and starts a timer. When this timer expires, the proxy switches to the Half-Open state.<p>Open: In this state, any HTTP request fails immediately and an exception is returned.</p><p>Half-Open: A limited number of requests are allowed to pass through. If all of those requests succeed, the state changes back to Closed. If any of them fails, the proxy assumes that the problem is still present, so it reverts to the Open state and restarts the timer.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ipCu0UXu3sSw19B0Z7xwbe3iNJhggsvk3PCb8tgC.png"><p>Figure 12: The 3 states of a circuit breaker</p><h3 style="text-align:justify;">Circuit breaker implementation</h3><p>To implement this pattern, we first create a Polly policy.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-u10vKlJ1igA4kQIKPEcVyI9Bc2VVBHhrzWb9ZJXL.png"><p>Figure 13: Creating the circuit breaker policy</p><p>Then, add this policy to the IHttpClientBuilder using the AddPolicyHandler extension method like this.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-9SoYB8tQBQNQa2tSl3LszcpD9S0ASWHsoyQMk2wt.png"><p>Figure 14: Using the circuit breaker policy when configuring HttpClient</p><p>The consumer code of IHttpClientFactory stays the same as in the Retry pattern.</p><p>From now on, if your requests fail 5 times in a row, the circuit takes a break for 30 seconds. Any request made during this 30-second period fails immediately with a BrokenCircuitException.</p><p>After those 30 seconds, if your first request succeeds, the circuit changes back to Closed; otherwise it goes back to Open and waits for another 30 seconds.</p><p>If the default behavior doesn’t cover your needs, or you want to control the circuit manually, the Polly documentation explains how to customize it.</p><h2 style="text-align:justify;">5. 
Bulkhead pattern</h2><p>The Bulkhead pattern is a type of application design that is tolerant of failure. It’s all about isolation: the resources consumed for communication with one service should not affect communication with another service.</p><p>For example, your service has 2 third-party integrations. Let’s call the first one Service A and the other one Service B.</p><p>If Service A is not available, the failed requests your service makes to Service A quickly stack up and cannot be released in a timely manner. Requests to Service B are soon affected as well, because we use the same resources for making requests to both services. The Bulkhead pattern prevents this by isolating the resource usage of Service A from that of Service B. Therefore, if failures from Service A stack up, they do not affect Service B at all.</p><h3 style="text-align:justify;">Bulkhead pattern implementation</h3><p>In your Program.cs, create the bulkhead policy as in the code below.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-HTotpKag962UEVk2QuzZvpIUPvdrDxEplS3xVsCO.png"><p>Figure 15: Creating a bulkhead policy</p><p>Then, we can use the bulkheadPolicy as follows:</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-3QSCTR0903SUDmM8nr5rxMbZqEwSSyifHxvqiFk7.png"><p>Figure 16: Using the bulkhead policy when configuring HttpClient</p><p>In this case, we allow only 1 concurrent request at a time. If more than 1 request is sent at once, your code throws a BulkheadRejectedException.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-zrXfOrYM02NZXh8faP4NwzHPRrk3Csy5kgmKOTP5.png"><p>Figure 17: BulkheadRejectedException thrown when the limit is exceeded</p>Polly gives us more flexibility through overloads of the BulkheadAsync method, which let us queue requests and handle rejections via the parameters maxQueuingActions and onBulkheadRejectedAsync. 
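<p>A minimal text sketch of that overload might look as follows. It assumes Polly v7 with the Microsoft.Extensions.Http.Polly package in an ASP.NET Core project; the client name "service-a", the queue size of 2 and the console message are illustrative choices, not values from the original project.</p>
<pre><code class="language-csharp">using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using Polly;

var builder = WebApplication.CreateBuilder(args);

// Allow 1 request to run and up to 2 more to wait in the queue;
// anything beyond that is rejected and the callback is invoked.
var bulkheadPolicy = Policy.BulkheadAsync&lt;HttpResponseMessage&gt;(
    maxParallelization: 1,
    maxQueuingActions: 2,
    onBulkheadRejectedAsync: context =&gt;
    {
        Console.WriteLine("Bulkhead is full, the request was rejected.");
        return Task.CompletedTask;
    });

// "service-a" is a hypothetical named client used only for this sketch.
builder.Services
    .AddHttpClient("service-a")
    .AddPolicyHandler(bulkheadPolicy);

var app = builder.Build();
app.Run();
</code></pre>
<p>Giving each downstream service its own named client and its own bulkhead policy is what keeps a pile-up of requests to one service from starving the other.</p>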
You can read more about it in the Polly documentation.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-MhFd3jXsAFQd1TrJiMad6iDRnq4r7VKTZrdr53s4.png"><p>Figure 18: Another overload of the BulkheadAsync method with more parameters</p><p>I hope this brief introduction to failures, fault-tolerant systems and fault handling helps you think differently about your problems.</p></body></html>

Did you know that in distributed systems, things sometimes just don't work the way they should because of failures? Bear in mind that failure is everywhere, and think differently about how you solve your problems.

Fault tolerance patterns for microservices

<html><body><h2 style="text-align:justify;">What Is Logging?</h2><p>Logging is the process of recording events and messages that occur in a system. These messages are usually stored in a file, but they can also be sent to other destinations such as databases or monitoring software.</p><p>Logging is a general-purpose tool, but we mostly use it for:</p><ul><li style="text-align:justify;">Troubleshooting and debugging: Logging provides a history of a system's activities, making it easier to find problems such as exceptions in the code. We can analyze the log to see what happened and make educated guesses about the cause.</li><li style="text-align:justify;">Auditing and security: Logs can be used to audit security events, user activities and system changes.</li><li style="text-align:justify;">Performance monitoring: Logs can capture performance-related data such as the response time of a system. We can use this data to identify bottlenecks and potential performance problems.</li></ul><h3 style="text-align:justify;">Logging Level</h3><p>Logging levels are defined differently by each implementation. However, they usually include these 4 levels:</p><ul><li style="text-align:justify;">Debug: This level contains detailed information that developers use for troubleshooting. We usually don't turn on this level in the production environment because it could produce a lot of logging data.</li><li style="text-align:justify;">Information: This level logs information on system operations.</li><li style="text-align:justify;">Warning: This level indicates potential issues that might require attention but are not critical yet.</li><li style="text-align:justify;">Error: This level indicates that an error occurred in the system.</li></ul>In .NET Core, the framework defines more log levels to distinguish their intention even further. 
These log levels are Trace, Debug, Information, Warning, Error, Critical, and None.<p>In the following sections, we will put them into practice to get a better understanding of log levels.</p><h2 style="text-align:justify;">How is logging implemented in .NET Core?</h2><h3 style="text-align:justify;">ILogger, ILoggerProvider and ILoggerFactory</h3><p>Logging in .NET Core is built from many components and classes. However, I will only introduce the 3 main components that we interact with most of the time.</p><ul><li style="text-align:justify;">ILogger is the interface we interact with to perform the logging. In the real world, we usually don't use this interface directly but ILogger&lt;T&gt; instead, which inherits from ILogger. Together with its extension methods, it lets us write logs at the appropriate log level via Log, LogCritical, LogDebug, LogError, LogInformation, LogTrace, LogWarning… You can find out more about the interface in the official .NET documentation.</li><li style="text-align:justify;">ILoggerProvider is used to create ILogger instances. Applications are not supposed to use this interface directly; they should use ILoggerFactory to create instances of ILogger.</li><li style="text-align:justify;">ILoggerFactory is also used to create ILogger instances and is meant to be used in application code. We register ILoggerProvider implementations with the ILoggerFactory so that the framework uses them.</li></ul><h3 style="text-align:justify;">How is the Logger bootstrapped internally?</h3><p>We can set a breakpoint in Program.cs to see how the framework bootstraps the logging mechanism.</p>By inspecting the IServiceCollection, we get some insights into what the framework does under the hood. 
By default, the framework registers ServiceDescriptors for all the interfaces that we mentioned in the previous section.<figure class="image image_resized" style="width:90.73%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200-2ORZaFFAAwF9jIwXOe3mww6XyA9J8sxjpOMYg5EZ.webp"></figure><p>Figure 01: Code change to inspect the IServiceCollection</p><p>Further analysis of the IServiceCollection at runtime reveals that the framework registers ILogger&lt;T&gt; and ILoggerFactory with a Singleton lifetime. We have 4 ILoggerProvider implementations registered by default (ConsoleLoggerProvider, DebugLoggerProvider, EventSourceLoggerProvider, EventLogLoggerProvider). Calling builder.Logging.ClearProviders() removes all the ILoggerProvider ServiceDescriptors from the IServiceCollection.</p><figure class="image image_resized" style="width:91.21%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(1)-g2fZTlt0NQBmKZQHlpTYBvCKcrfXtCjuGjMIL1tE.webp"></figure><p>Figure 02: Inspecting the IServiceCollection at runtime</p><h3 style="text-align:justify;">Logging Scope</h3><p>The interface ILogger&lt;T&gt; comes with a very handy method, BeginScope, which can be used to attach the same attribute (state) to every log written within a portion of code. This method doesn’t produce any distinct output with the default console logger configuration. However, it comes in very handy when we use Serilog and Seq.</p><figure class="image image_resized" style="width:91.22%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(2)-o7w2Fra4PuT0cKfKhVTKxbtsDRyKfsL0aRRLXyss.webp"></figure><p>Figure 03: Logging Scope example</p><h3 style="text-align:justify;">Log Filtering</h3>The framework also allows us to configure log filtering by specifying the desired log level for a specific log category via the AddFilter method. 
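<p>As a small illustration of AddFilter outside of a web project, the sketch below builds a logger factory by hand. It assumes a console application that references the Microsoft.Extensions.Logging.Console package; the category names "Demo.Noisy" and "Demo.Other" are hypothetical and exist only for this example.</p>
<pre><code class="language-csharp">using Microsoft.Extensions.Logging;

using var loggerFactory = LoggerFactory.Create(logging =&gt;
{
    logging.AddConsole();
    // Fallback level for categories that have no matching filter.
    logging.SetMinimumLevel(LogLevel.Information);
    // Categories starting with "Demo.Noisy" only log Warning and above.
    logging.AddFilter("Demo.Noisy", LogLevel.Warning);
});

ILogger noisy = loggerFactory.CreateLogger("Demo.Noisy.Worker");
ILogger other = loggerFactory.CreateLogger("Demo.Other");

noisy.LogInformation("Dropped: below Warning for the Demo.Noisy filter.");
noisy.LogWarning("Written: Warning passes the Demo.Noisy filter.");
other.LogInformation("Written: no filter matches, so the minimum level applies.");
</code></pre>
<p>Filters match on category prefixes, which is why the rule for "Demo.Noisy" also covers the "Demo.Noisy.Worker" logger.</p>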
If no log filter matches, the minimum log level is applied.<figure class="image image_resized" style="width:92.07%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(3)-qfgE7ietMdQ7vdJV3c9NKVp9HYHCGYVadU3CdxzR.webp"></figure><p>Figure 04: Log filter</p><h2 style="text-align:justify;">What is Serilog and Structured Logging</h2><h3 style="text-align:justify;">What is Serilog?</h3><p>Serilog is a logging library for .NET and .NET Core that supports structured logging. It can store (sink) logs in a plain text file, in a database, or even in Seq (a separate application that provides a GUI for log querying and visualization).</p><h3 style="text-align:justify;">How does Serilog work?</h3><p>We won’t dig deep into how Serilog works internally. However, we can take a glance at the highest level by using the same technique that we used to observe how the framework bootstraps logging. We can see that Serilog simply replaces the ILoggerFactory by adding another ServiceDescriptor for ILoggerFactory.</p><figure class="image image_resized" style="width:90.57%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(4)-gDujVcWh87TP9YzH15lMkEJaX9h5DzpYH1pVEGIt.webp"></figure><p>Figure 05: Serilog replaces the ILoggerFactory with its own implementation</p><h3 style="text-align:justify;">Serilog Integration Guide</h3>In this guide, we will integrate Serilog into a .NET 8 project and sink the logs to both a file and the console. We will use the template that keeps all the code inside Program.cs (without a Startup.cs file) so that the post is easier to follow. 
First, we bootstrap our project by running the following command in Windows Terminal:<p><code>dotnet new webapi -o SerilogIntegrationGuide</code></p><p>The name SerilogIntegrationGuide is not mandatory. It's the name of your project, and you can name it whatever you want as long as it respects the naming conventions of your operating system.</p><p>After the project is created, we can use the following command to install Serilog:</p><p><code>dotnet add package Serilog.AspNetCore</code></p><p>Before we change our code to integrate with Serilog, let's capture the output of our current program by hitting an endpoint of our web API.</p><figure class="image image_resized" style="width:90.01%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(5)-RjtpqiZOg8w9XEh0Dup4jo4Pu5keOrDcdBMGYcsL.webp"></figure><p>Figure 06: Plain output of the default console logging</p><p>The output is plain text without much formatting and is hard to read. Now let’s wire up the Serilog package and change some code before we observe the output again.</p><figure class="image image_resized" style="width:90.77%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(6)-abEpolo7L6wuAFKXpnsFbJDroG8tbOqV16KtAOPo.webp"></figure><p>Figure 07: Serilog integration code change</p><p>Serilog integration is very easy: by creating the LoggerConfiguration and adding the line builder.Services.AddSerilog(), we have completed the integration with the logging library. Now let’s run the project, hit the endpoint and observe the output again.</p><figure class="image image_resized" style="width:90.47%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(7)-wKZINeEsgwMWrcxJ0tdJeB3PArA6c1yXIb9CRR8Y.webp"></figure><p>Figure 08: Serilog console output</p><p>The output is nicely formatted.</p>Now let’s try sinking (saving) the log to a file instead. 
To sink the log to a file, we have to install another NuGet package by running the following command in the terminal:<p><code>dotnet add package Serilog.Sinks.File --version 5.0.0</code></p><p>After running the command, we have to make some changes to the code so that the logs are written to a file.</p><figure class="image image_resized" style="width:90.99%;"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/pasted%20image%200%20(8)-4jugdk90SKZgefqWLq401TBJ0qnoixKqL328xz1k.webp"></figure><p>Figure 09: File rolling integration code change</p><p>Then, we can run the project again and observe that the log file is created inside our project folder. The second parameter, “rollingInterval”, indicates that a new log file is created every hour, so each log file stays small and contains only the logs from a 1-hour window.</p><h2 style="text-align:justify;">Conclusion</h2><p>In conclusion, logging provides a vital mechanism for understanding and maintaining your C# applications. By leveraging Serilog, you gain a powerful and flexible tool to capture, enrich, and store log data. This approach empowers you to troubleshoot issues efficiently, monitor application health, and gain valuable insights into system behavior. As your application evolves, Serilog's scalability and extensibility ensure it can continue to be a cornerstone of your observability strategy.</p><h2 style="text-align:justify;">Resources</h2><ul><li>Demo source code: <a href="https://github.com/saigontechnology/Blog-Resources/tree/main/DotNet/do.tran/LoggingInDotNet">https://github.com/saigontechnology/Blog-Resources/tree/main/DotNet/do.tran/LoggingInDotNet</a></li></ul></body></html>

The software industry prioritizes performance, reliability, speed, and scalability. But there's another equally important factor: observability. This post delves into logging, a crucial aspect of achieving observability in software.

Logging And Structured Logging With Serilog The Definitive Guide

<html><body><h2>What is TPL Dataflow?</h2><p>TPL Dataflow (Task Parallel Library Dataflow) is a .NET library designed for building robust and scalable concurrent data processing pipelines. It offers a declarative model in which you define a network of interconnected "blocks" that process and transport data, enabling efficient and flexible parallelism.</p><p>Unlike the traditional programming model, which usually requires callbacks and synchronization objects, TPL Dataflow provides building blocks that let you construct a pipeline-like dataflow specifying how your data should be processed and which steps depend on each other. For example, we can create an image processing pipeline that processes an image using four connected blocks. You can think of it as in the diagram below.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-CGIUr0NJHwe4MfQ0rWhCmarkFjt2Tj2LPwUUOO0t.png"><p>Figure 1: Example diagram of a TPL Dataflow workflow</p><h2>How to use TPL Dataflow</h2><p>TPL Dataflow is not included by default. So, if we want to use it, we have to install the NuGet package named System.Threading.Tasks.Dataflow. In .NET Core and later versions, we can install it with the dotnet CLI using the command <code>dotnet add package System.Threading.Tasks.Dataflow</code>, or by using the NuGet Package Manager.</p><h3>Source block</h3><p>As I mentioned in the previous section, TPL Dataflow consists of building blocks. A block that produces data is called a source block. A block that receives data is called a target block.</p><p>However, this distinction is only relative, because a block can send data (as a source block) to one block and receive data (as a target block) from another block. In this section, we will take a glance at the source block.</p>It's defined by the interface ISourceBlock&lt;TOutput&gt;, in which TOutput is the type of output that the source produces. Microsoft provides some predefined blocks that implement this interface. 
For example: BufferBlock&lt;T&gt;, BroadcastBlock&lt;T&gt;, TransformBlock&lt;TInput, TOutput&gt;, TransformManyBlock&lt;TInput, TOutput&gt;... However, you can also create your own block by creating a new class that implements this interface. Here’s the structure of ISourceBlock:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-pntJW25udCto1dcl9yFx4bS5BDOJPORbCiaQaKfW.png"><p>Figure 2: Structure of the ISourceBlock interface</p><h3>Target block</h3><p>A target block receives data from another block and performs work based on that data. Similar to a source block, a target block is defined by the interface ITargetBlock&lt;TInput&gt;, in which TInput is the type of data it receives. If a class needs to implement both ISourceBlock and ITargetBlock, it should implement IPropagatorBlock&lt;TInput, TOutput&gt; instead. This interface inherits from both of the aforementioned interfaces and expresses the intention of both receiving and producing data. Here’s the structure of ITargetBlock.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-mdFDxdbh4TmZlArlDRar9dZr6xPBzrKwRAa0YGRQ.png"><p>Figure 3: Structure of the ITargetBlock interface</p><p>And the structure of IPropagatorBlock:</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-NNlizPNmYiLw1gSBf5cS2fowavrLo4qx9ASpoGlT.png"><p>Figure 4: Structure of IPropagatorBlock, which simply inherits from ISourceBlock and ITargetBlock</p><h2>Receive/Send data to blocks</h2>TPL Dataflow provides methods so that we can send/receive data synchronously or asynchronously. 
To send and receive messages synchronously, we can use the methods Post and Receive, respectively.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-aZybbYRwdWLDzSF9h7qJMXRguAhlhcBUioVMhFbd.png"><p>Figure 5: Example of sending and receiving a message in a TPL Dataflow block</p><p>The above code outputs 1 to the console.</p><p>Similarly, we can use SendAsync and ReceiveAsync to send data to and receive data from a block asynchronously.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-dppIfhuF7deKQE5zx3bCoLHgQMxqXbgIX6AOAPjy.png"><p>Figure 6: Example of sending and receiving a message in a TPL Dataflow block asynchronously</p><h2>Predefined blocks</h2><p>In this section, I will introduce some predefined blocks that Microsoft has crafted for us. These blocks are general-purpose and usually take delegates as parameters so that we can define the underlying behavior.</p><h3>Buffering block</h3><p>The buffering block BufferBlock&lt;T&gt; is a general-purpose block that acts as a queue, storing data in a FIFO (first in, first out) manner. It can be written to by multiple blocks and read by multiple blocks as well. However, each individual message is delivered to only one block.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-UWQAX931BZn2VsVRJDI5FloG6I9vipXftz3DEmEf.png"><p>Figure 7: Example of using BufferBlock</p><p>The above code outputs 1, 2 and 3, in that order.</p><h3>Action block</h3><p>An ActionBlock is a target block that performs a predefined action upon receiving a message, for example printing to the console, sending an email, or writing to a file.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-RApxfIs5reg1hqSuGjYpGdZJUNOAJ8CHqjhGUapW.png"><p>Figure 8: Example of using an ActionBlock</p>In this example, we create a buffer block that receives messages and sends them to the action block. 
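<p>In text form, a minimal sketch of such a buffer-to-action pipeline might look like this. It is a reconstruction under assumptions rather than the exact code from the figure: the variable names are illustrative, and it only uses the System.Threading.Tasks.Dataflow types named in this post.</p>
<pre><code class="language-csharp">using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Source: queues incoming integers in FIFO order.
var buffer = new BufferBlock&lt;int&gt;();

// Target: runs the delegate once for every message it receives.
var printer = new ActionBlock&lt;int&gt;(n =&gt; Console.WriteLine(n));

// Forward messages from the buffer to the printer and let completion flow through.
buffer.LinkTo(printer, new DataflowLinkOptions { PropagateCompletion = true });

buffer.Post(1);
buffer.Post(2);
buffer.Post(3);

// Signal that no more messages will arrive, then wait for the printer to drain.
buffer.Complete();
await printer.Completion;
</code></pre>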
Note that we create the action block by passing an Action&lt;TInput&gt; into its constructor to define its behaviour. The action block prints 1, 2, and 3, in that order.<h3>Transform block</h3><p>A TransformBlock is a block that takes an input and produces an output that depends solely on the given data. In other words, it can be considered the map() operator of the dataflow world.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ZsHx93gmlkJnPv3qTvATKMZTriduMG47LCZ5CmPu.png"><p>Figure 9: Example of using TransformBlock</p><p>The above example creates a transform block that takes an integer and produces a string in the format 'Here's the given number {}'. In this example, I kept it simple and compact by using the synchronous version of the TransformBlock. However, we can also produce the output using asynchronous operations.</p><h2>Example use case and implementations</h2><p>In this section, we will piece everything from the previous sections together to get a bigger picture of how it is used in the real world, by implementing a simple use case.</p>The use case is that we are writing code for a weather system that receives a message containing a date as the payload. 
Upon receiving the message, our system reacts by publishing the average temperature of that date to multiple media channels (Facebook, Telegram, Twitter).<p>First, we define a BufferBlock to hold all the messages that arrive.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-8RmN4VpkGKN5F3mm84AiEx3VUiu8YYzmJYpgSgk6.png"><p>Figure 10: Defining the buffer block to store incoming messages</p><p>Then, we define the next block, which takes the data from the first block and performs a query based on the provided date.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-yGdVVWtQH4FNK0vSwRHIKbSuyyoBuuXjJqN8dC4V.png"><p>Figure 11: Creating a TransformBlock that returns data from the database based on the provided input</p><p>We need another transform block to map the list of temperatures on a date to the average temperature. This piece of logic could be merged into the previous block. However, separating them gives us better control over each piece of logic, so I put it into another transform block.</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-BhXXb2qFpdYHTBu51pGG2OJzCzxwcMxrfqQcF3JO.png"><p>Figure 12: Creating another TransformBlock that takes a collection of temperatures and returns the average temperature</p><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-vsT8MEARLCiyVdwxcNdKh8JnPfgzux6M00DKRBT9.png"><p>Figure 13: Creating 3 ActionBlocks that simulate sending messages to media channels</p>Now, recall that we need to send to all three channels at the same time, so we need some mechanism that can do that. Luckily, the predefined BroadcastBlock can handle this. 
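<p>To make the fan-out behaviour concrete on its own, here is a minimal, self-contained sketch of a BroadcastBlock feeding three action blocks. The variable names, the sample value 27.5 and the console messages are illustrative assumptions and are not taken from the original project.</p>
<pre><code class="language-csharp">using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// BroadcastBlock needs a cloning function; a double is a value type,
// so returning the input unchanged is sufficient here.
var broadcast = new BroadcastBlock&lt;double&gt;(temperature =&gt; temperature);

// Three stand-in channel senders that only print to the console.
var facebook = new ActionBlock&lt;double&gt;(t =&gt; Console.WriteLine($"Facebook: {t}"));
var telegram = new ActionBlock&lt;double&gt;(t =&gt; Console.WriteLine($"Telegram: {t}"));
var twitter = new ActionBlock&lt;double&gt;(t =&gt; Console.WriteLine($"Twitter: {t}"));

// Every linked target is offered a copy of each message.
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
broadcast.LinkTo(facebook, linkOptions);
broadcast.LinkTo(telegram, linkOptions);
broadcast.LinkTo(twitter, linkOptions);

broadcast.Post(27.5);

// Completion flows from the broadcast block to all three targets.
broadcast.Complete();
await Task.WhenAll(facebook.Completion, telegram.Completion, twitter.Completion);
</code></pre>
<p>Unlike a BufferBlock, which hands each message to a single consumer, the broadcast block offers every message to all of its linked targets, which is the behaviour this use case needs.</p>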
We will first define a broadcast block and link it to the three action blocks that we defined in the previous step.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-mYvQsr3bcIp82HzkNOpJGVcqpBOkK8LYjDExTFU9.png">Figure 14: Defining a broadcast block. We’ve got everything we need; now it’s time to link everything together and run our application.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-YiSApsAMAcReUv6AaHibLwQ809AhU65PMh4obaQE.png">Figure 15: Link all the blocks and run the application. You should see the results printed out on the console. Because each block executes its action asynchronously and independently of the others, there is no guarantee that the console prints the messages in the order in which they arrive. However, the behaviour of the program remains correct.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-M7LRZBcS8Gqu3KuA5oBeYZ7fNHA4PAjyorfomMPA.png">Figure 16: The result is printed on the console after running the application<h2>Conclusion</h2>Just like everything else, TPL Dataflow has its pros and cons. We have seen some of them along the way; however, I would like to reiterate them here. Pros:<ul><li>It makes our code cleaner than the imperative way of synchronization, giving us more control over each individual process.</li><li>It's suitable for pipeline-like processing where we can split our process into smaller individual steps that can work independently of each other.</li><li>Dataflow works in a message-passing manner in which one block passes messages to another block. The work of each block is performed in its own context (thread), which can be controlled via MaxDegreeOfParallelism, making the management of threads easier so we can focus better on our business logic.</li><li>It gives you reactive programming to some degree, where you can react to events.</li></ul>Cons:<ul><li>It gives you less control over the underlying synchronization. 
As mentioned in previous sections, when working with TPL Dataflow you don't have to deal with synchronization yourself; however, it is still built on top of synchronization primitives underneath. This lets us focus on our business logic but gives us less control over threading and tasks.</li><li>Even though it's suitable for parallel processing and you can control the degree of parallelism, using it the wrong way could result in performance loss.</li></ul>The source code used in this post can be found <a href="https://github.com/saigontechnology/Blog-Resources/tree/a99373882354e578cf37bd84ac97bc3c2225b7fb/DotNet/do.tran/Dataflow">here</a>.</body></html>
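The weather pipeline walked through in the post above (buffer, query, average, broadcast, three channels) can be sketched in one piece as follows. This is a minimal sketch, not the author's code from the figures: `WeatherPipeline`, `QueryTemperatures` and the fixed temperature values are hypothetical stand-ins for the database query, and it assumes the System.Threading.Tasks.Dataflow library is available to the project.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class WeatherPipeline
{
    // Hypothetical stand-in for the database query: always returns the same readings.
    private static IReadOnlyList<double> QueryTemperatures(DateTime date) =>
        new[] { 20.0, 25.0, 30.0 };

    public static async Task<IReadOnlyCollection<string>> RunAsync(IEnumerable<DateTime> dates)
    {
        var sent = new ConcurrentBag<string>();
        // Let completion flow from each block to the next one.
        var propagate = new DataflowLinkOptions { PropagateCompletion = true };

        var incoming = new BufferBlock<DateTime>();
        var query = new TransformBlock<DateTime, IReadOnlyList<double>>(date => QueryTemperatures(date));
        var average = new TransformBlock<IReadOnlyList<double>, double>(temps => temps.Average());
        var broadcast = new BroadcastBlock<double>(value => value);

        // One ActionBlock per media channel; each just records what it would send.
        var channels = new[] { "Facebook", "Telegram", "Twitter" }
            .Select(name => new ActionBlock<double>(avg =>
                sent.Add(name + ": " + avg.ToString("F1", CultureInfo.InvariantCulture))))
            .ToArray();

        incoming.LinkTo(query, propagate);
        query.LinkTo(average, propagate);
        average.LinkTo(broadcast, propagate);
        foreach (var channel in channels)
            broadcast.LinkTo(channel, propagate);

        foreach (var date in dates)
            incoming.Post(date);
        incoming.Complete();

        // Wait until every channel has drained its input.
        await Task.WhenAll(channels.Select(c => c.Completion));
        return sent;
    }
}
```

Calling `await WeatherPipeline.RunAsync(new[] { DateTime.Today })` yields one recorded message per channel. The order of the messages is not guaranteed, for the same reason given in the post.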

We have been writing imperative code with primitive synchronization since the first day. However, Microsoft has a hidden gem that also allows you to deal with asynchronous operations, helping you split complex tasks into small and manageable pieces. In this post, we will take a look at TPL Dataflow in .NET.

Introduction to TPL Dataflow in C#

<html><body><h2 style="text-align:justify;">1. What is a channel?</h2>Channels in C# are based on the concept of message passing: the producer sends messages to the channel, and the consumer receives them. The messages can be of any type, and they are stored in the channel until the consumer is ready to receive them. Channels can also have a bounded capacity, meaning that they hold a fixed number of messages and then make the producer wait until the consumer makes room for new messages. In this way, channels provide a simple yet powerful way to implement asynchronous communication and coordination between different parts of a program, and they are becoming increasingly popular in C# and other modern programming languages. Channels live in the System.Threading.Channels namespace, which provides 2 types of channels:<ul><li style="text-align:justify;">UnboundedChannel: This type of Channel can hold an unlimited number of messages.</li><li style="text-align:justify;">BoundedChannel: This type of Channel can only hold a limited number of messages. We can configure the behavior when the queue is full using BoundedChannelOptions.FullMode. Besides, the library also lets us configure more properties such as:</li><li style="text-align:justify;">SingleWriter: Setting this to true does not make the runtime enforce a single writer or throw an exception when there is more than one. It is a hint: the channel may be able to optimize certain operations on the assumption that there is only one writer.</li><li style="text-align:justify;">SingleReader: Same as SingleWriter but for the reading side.</li><li style="text-align:justify;">AllowSynchronousContinuations: Indicates whether continuations may be invoked synchronously. The option is false by default. We should be careful and take measurements before and after turning this flag on.</li></ul><h2 style="text-align:justify;">2. 
Basic Usage</h2>To use a Channel, we first create a Channel instance via one of the Channel type’s static methods, as shown below.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-9ufxQdbsjPF2DajNxdF3Ky2cHT0HG5buAQeikSVO.png">Figure 01: Creating a Channel instance. In the preceding example, we created an unbounded channel that carries messages of type int.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-QeHHTvPRO04dhUkBiJbMLhXh7MRIF5PtIhk5auRO.png">Figure 02: Sending messages using Channel. Then, we can write some data to the writer part of the channel and read it using the reader part. We can also use channels for synchronization in a multi-threaded context.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-odkiAsIJ8aEePYoSz6D4mKkdgCUEUPY0jaLTKqgs.png">Figure 03: Using Channel in a multi-threaded context. The preceding code creates 2 threads: the first one reads from the channel and the second one writes to the channel. The reader thread can continue its work only after the writer thread has written the value into the channel. We can think of the Channel as a queue in this case: the reader attempts to read a message, but at the time of reading there is no message yet. Once the message arrives from the writer, the reader can continue to work on its task.<h2 style="text-align:justify;">3. Unbounded channel</h2>UnboundedChannel is a type of channel that provides an unbounded buffer for messages of type T. It is available in the System.Threading.Channels namespace and can be used to create a channel that can hold an unlimited number of messages. An UnboundedChannel can be useful in scenarios where the number of messages being produced is not known in advance, or where the producer and consumer are processing at different speeds. 
However, it is important to note that an unbounded channel can potentially consume a large amount of memory if the rate of production exceeds the rate of consumption, so it should be used with care.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-l4c00yg2SeGWvJDTBv5CRYQnB0Kr5zKApuRwDJsz.png">Figure 04: UnboundedChannel example. The example code above should print out "Received item 0", .... "Received item 100" in the same order in which the messages were written.<h2 style="text-align:justify;">4. Bounded channel</h2>Like UnboundedChannel, BoundedChannel is a type of channel in C#. The only difference is that it provides a buffer of a fixed size for messages of type T. A bounded channel can be used in scenarios where a producer generates data at a faster rate than a consumer processes it, or where a consumer is able to handle only a fixed number of items at a time. By specifying a limit on the buffer size, a bounded channel helps to control the amount of memory that is consumed by the channel.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-KBgLl8fSZ0DPscL5CzeHoq6AmOZu4rOCdI6fo2nv.png">Figure 05: BoundedChannel example. As you may notice, the example for BoundedChannel is exactly the same as the one for UnboundedChannel in the previous section, and so is the output. The behavior is the same because the consumer is fast enough to process every message as quickly as the producer writes it, so the buffer never fills up. Let's tweak our code a bit to get some different behaviors. First, we can set BoundedChannelOptions when creating the channel and decide what happens to our messages when our buffer is full. 
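To make the full-buffer behaviour concrete, here is a minimal sketch under stated assumptions: `FullModeDemo` and `Survivors` are hypothetical names, the capacity of 10 and the 100 messages mirror the numbers used in this post, and, unlike the post's example, nothing is read until all writes have finished, which makes the outcome deterministic rather than timing-dependent.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

public static class FullModeDemo
{
    // Writes 0..99 into a channel of capacity 10 without reading,
    // then returns whatever is still in the buffer.
    public static List<int> Survivors(BoundedChannelFullMode mode)
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(10)
        {
            FullMode = mode
        });

        for (var i = 0; i < 100; i++)
        {
            // In the Drop* modes TryWrite still succeeds; the channel decides what to discard.
            channel.Writer.TryWrite(i);
        }
        channel.Writer.Complete();

        var survivors = new List<int>();
        while (channel.Reader.TryRead(out var item))
        {
            survivors.Add(item);
        }
        return survivors;
    }
}
```

With `BoundedChannelFullMode.DropWrite` the buffer keeps 0 to 9, with `DropOldest` it keeps 90 to 99, and with `DropNewest` it keeps 0 to 8 followed by 99.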
Instead of using the constructor that takes an int as its parameter:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-PnRoXUOUAyJlDBF97XCxPyKtJxsvu665aiSrHiv1.png">Figure 06: BoundedChannel which takes int as param in constructor. We use the other overload, which takes BoundedChannelOptions:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-Kfr5HmZm2QVI06Ce2TflaSFLZLqTZ734tXaWXUMR.png">Figure 07: Using the other overload for constructor. Then, we also slow down our consumer so that the queue fills up.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-OlcUozirjZt1MMPL3cYzVRfH9wwvlgPkykHYbdoj.png">Figure 08: Tweaking the consumer’s consumption speed. With the changes applied, the code should output only "Received item: 0" ... "Received item: 9". BoundedChannelFullMode.DropWrite configures the channel to drop any new message that arrives while the channel is full. Because the consumer is 10 times slower than the producer, the queue is still holding 0...9 while they are being processed by our consumer, so all the messages from 10..99 are dropped. We can also use other FullMode options such as DropOldest, which drops the oldest message in the queue to make room for the new one. Our code should now output only "90...99". A slightly different option is DropNewest, which drops the newest message already in the queue when the queue is full and another message arrives. In this case, the output is 0...8,99. In the next sections, I will give you some patterns that could be useful for real-world scenarios.<h2 style="text-align:justify;">5. Single producer, Single consumer</h2><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-yyRaBWNzyRUpU4MRzuVgylyTOKLQW1GzMzCYO6BY.png">Figure 09: Single producer single consumer. In this example, we create a producer and a consumer that barely keep up with each other. 
Just like the example in the previous section, we let them run concurrently until all the messages have been consumed. Once the producer has finished, we mark the channel's writer as completed.<h2 style="text-align:justify;">6. Multiple producers, single consumer</h2><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-MjdjSdQTfE2DLB66jQOzkC0uGcNRgAMNUgwOOBi7.png">Figure 10: Multi producer single consumer. In this example, we first create 1 producer and 1 consumer. Since the consumer is much faster than the producer, we add another producer to balance things better. In a real-world scenario, we could use this pattern to make better use of resources and increase throughput, because a fast consumer would otherwise be idle most of the time.<h2 style="text-align:justify;">7. Single producer, multiple consumers</h2><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ADyWmeOyhfkzh6pzRV8FdnsOJLUmlUfKaf8sM40e.png">Figure 11: Single producer multiple consumers. This example demonstrates a quite common scenario: the producer is fast and the consumer is much slower. In such cases, we can add more consumers to balance the difference between the consumer and the producer. In our case above, by scaling up to 3 consumers, we can process 3 messages in parallel. Note that even with 3 consumer instances, the consumers can't entirely keep pace with the producer. Since the channel is unbounded, running this for a long time lets the backlog grow without limit and could overwhelm the consumers.<h2 style="text-align:justify;">8. Conclusion</h2>In conclusion, channels in C# provide a powerful mechanism for implementing concurrent and asynchronous communication between different parts of a program. 
They allow for safe and efficient sharing of data between threads or asynchronous operations, without the need for complex synchronization mechanisms.Using channels, we can create flexible and responsive applications that can handle a high degree of concurrency and parallelism, while avoiding common pitfalls such as race conditions or deadlocks.The source code used in this post can be found <a href="https://github.com/dotransts/channel_example">here</a>.Source: https://saigontechnology.com/blog/implement-producerconsumer-patterns-using-channel-in-c</body></html>
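The multiple-producers, single-consumer pattern from section 6 of the post above can be sketched as follows. This is a minimal sketch rather than the code from the figures: `ProducerConsumerDemo` and its parameters are hypothetical names, and the producers write plain integers without any simulated delay.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    public static async Task<List<int>> RunAsync(int producerCount, int itemsPerProducer)
    {
        // SingleReader is only a hint that lets the channel optimize for one consumer.
        var channel = Channel.CreateUnbounded<int>(new UnboundedChannelOptions
        {
            SingleReader = true
        });

        // Each producer writes its own distinct range of numbers.
        var producers = Enumerable.Range(0, producerCount)
            .Select(p => Task.Run(async () =>
            {
                for (var i = 0; i < itemsPerProducer; i++)
                {
                    await channel.Writer.WriteAsync(p * itemsPerProducer + i);
                }
            }))
            .ToArray();

        // The single consumer reads until the writer is marked as completed.
        var consumer = Task.Run(async () =>
        {
            var received = new List<int>();
            await foreach (var item in channel.Reader.ReadAllAsync())
            {
                received.Add(item);
            }
            return received;
        });

        // Complete the writer only after every producer has finished.
        await Task.WhenAll(producers);
        channel.Writer.Complete();
        return await consumer;
    }
}
```

Completing the writer only after all producers are done is the important design choice here: completing it earlier would make the remaining writes fail, and never completing it would leave the consumer waiting forever.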

In C#, Channel is a type that enables communication between two or more asynchronous operations or threads, allowing them to exchange data in a safe and efficient manner. In this post, we demonstrate how to use it and implement some patterns through a step-by-step guide.

Implement Producer/Consumer patterns using Channel in C#

<html><body><h2 style="text-align:justify;">1. Introduction</h2>According to <a href="https://jwt.io">jwt.io</a>: “JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed.” In a nutshell, JWT allows us to exchange data securely using cryptographic algorithms. When people refer to JWT, they usually mean JWS (JSON Web Signature), which has 3 parts separated by a dot ("."). Although anyone can see its payload, it’s considered secure because the JWT can be verified using the signature part. We can use websites like <a href="https://jwt.io">jwt.io</a> to debug our generated token. Along with JWS, we could represent a JWT using JWE (JSON Web Encryption), which also adds confidentiality to the result. With this type of encryption, only the consumer knows what is in the payload. In this post, I will cover only the JWS generated by the HS256 algorithm and reserve JWE for another post for the sake of brevity and simplicity. So, every time I mention JWT in this post, I'm talking about JWS and not JWE. If you are interested, you can read more about JWE <a href="https://www.rfc-editor.org/rfc/rfc7516">here</a>.<h2 style="text-align:justify;">2. Jwt structure</h2>A JWT consists of 3 parts: Header, Payload, and Signature.<h3 style="text-align:justify;">Header</h3>The Header of JWT is a JSON object. 
This part identifies which algorithm is used to generate the signature. It usually contains 2 fields, "alg" and "typ". For example:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-CnNWoXbd2tGDLp2T8F7Hsy50b8HA4FGNvoXjBvvA.png">Figure 01: JWT Header example. The above example states that the Signature will be created using HS256 (HMAC with the SHA-256 algorithm), with the header and payload as input.<ul><li style="text-align:justify;">"alg" is the algorithm used to generate the signature. </li><li style="text-align:justify;">"typ" is the media content type of the JWT token. According to <a href="https://www.rfc-editor.org/rfc/rfc7519#section-5.1">RFC 7519, Section 5.1</a>, "This parameter is ignored by JWT implementations; any processing of this parameter is performed by the JWT application".</li></ul><h3 style="text-align:justify;">Payload</h3>The Payload is also a JSON object, containing multiple claims. We have 3 claim types:- Registered claim names (defined in <a href="https://www.rfc-editor.org/rfc/rfc7519">RFC 7519</a>). - Public claims (not defined in <a href="https://www.rfc-editor.org/rfc/rfc7519">RFC 7519</a> but published <a href="https://www.iana.org/assignments/jwt/jwt.xhtml">here</a>).- Private claims (only the JWT producer and JWT consumer know what the claim stands for). For example, we can define a claim like this: "tenantId":"tenant 123" as a tenant id in our multi-tenant application.<h3 style="text-align:justify;">Signature</h3>We can create our signature using a symmetric or asymmetric algorithm (declared in our header). For example, if we want to use HS256 to generate our signature, the "alg" in our header should have the value "HS256".<h2 style="text-align:justify;">3. Create Jwt using c#</h2>In this section, we will use the HMAC-SHA256 function to generate the JWT. 
Before diving into the code, here's the formula for generating the JWT: jwt = Base64Url(header).Base64Url(payload).Base64Url(signature), where the signature is computed as: Signature = HmacSha256(key, Base64Url(header).Base64Url(payload)). Here "." is the literal dot that joins the parts. This formula looks a bit messy but is really easy to implement. Follow along with me and I will try my best to explain everything. First, we will define a class for Payload and Header.<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ilXpkbxZc1HDuddEuppKrX8CVr3Uc538mo2F3YKY.png"></figure>Figure 02: Define the Header and Payload. We also need a method to generate the JWT. This method takes the header, the payload and the secret key, and returns a string whose parts are encoded using the Base64UrlEncode function.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-aKMyQJ8zowFqUfvXbGozI7Nkv9G33vtoTSbYPYwd.png">Figure 03: Define the method’s signature for generating the JWT. To generate the first two parts of the JWT, we only need to serialize the header and payload, then encode each of them using Base64Url.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-IqKmC8KrmensIUs1GdHd21LiGS8stohUPM1NDYqM.png">Figure 04: Defining the helper methods and use them inside the MakeJwt method. For JSON serialization, we will use Newtonsoft and set the NamingStrategy to CamelCase. You can use your favorite JSON serialization library, but remember to set its naming strategy to CamelCase (see the Header section above).<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-jGT6EMSE75QPc7nVgq5Lt5hAtEhntQGrfAv2uFgo.png">Figure 05: Setting json contract resolver to camel case and the SerializeObject helper method. Generating the signature is a bit trickier. First, we need to create an instance of the HMACSHA256 class using the secret key provided. Then we call ComputeHash on the signing input, which is constructed from the first 2 parts of our JWT joined by a dot. 
Note that ComputeHash returns a byte[]. We need to use Base64UrlEncode on this byte array to get the string output as the third part. Finally, we join the three parts using string interpolation and return the result.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-J9Kcth7XMzWwGWAduETa1yvifBG7fUuwZiab7ME0.png">Figure 06: Generating the signature &amp; using it inside the MakeJwt method. We can use <a href="https://jwt.io/">jwt.io</a> to debug our generated token. Here’s an example of using the code.<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-apkRlNCmN2I2U7ovBqeg1LP8er4SJk0PbmYjcAso.png"></figure>Figure 07: Generating the jwt token using our MakeJwt method<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-QOLbDsNAP0FVFP3PSGu1H2qgsjysR0EuKpv6BboO.png"></figure>Figure 08: Verifying our generated JWT using jwt.io<h2 style="text-align:justify;">4. Parsing and verifying Jwt using c#</h2>Decoding and verifying a JWT is also very simple. First, we split the JWT into 3 parts using '.' as the separator character. The first part is our header, the second part is the payload and the third part is the signature. We use the Base64UrlDecode method to decode the first 2 parts, then deserialize them to get the header and payload. To verify the token, we only need to generate the signature again using the formula above and compare it against the third part of the JWT. In our case, we are using the HS256 algorithm to generate the JWT, so we can simply hard-code this condition. 
For general use cases, we must verify using the provided algorithm in the header. We can use the JWT generated by <a href="https://jwt.io">jwt.io</a> to check our VerifyJwt method as well.<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-JB5FL3spXCzfTfCth3s69L92scya8GbCbcSqgOUP.png"></figure>Figure 09: DecodeBase64Url helper method<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ICu5KgB8Rz3ctSPdGjVE9Z1xDNenVxhxD7AqYY7y.png"></figure>Figure 10: VerifyJwt method<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-WsNoWBJh8yECrqT2iYqdkdEaxlKFALTPZ8DPFiqM.png"></figure>Figure 11: Try generating a token using jwt.io<figure class="image"><img src="https://api.careers.saigontechnology.com/storage/ck-editor/C-Tbg2Zhq81oaPVBfspw4SE9gmjwNBoxPOqRuoH6ET.png"></figure>Figure 12: Verifying the JWT using our VerifyJWT method<h2 style="text-align:justify;">5. Conclusion</h2>In this post, we gained an insight into how JWT works and how to create and verify one. Hopefully, we will have a dedicated post for JWE in the future as well. Source code: <a href="https://github.com/dotransts/jwt_example">https://github.com/dotransts/jwt_example</a> Source: <a href="https://saigontechnology.com/blog/json-web-token-using-c">https://saigontechnology.com/blog/json-web-token-using-c</a></body></html>
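The signing formula and the verification steps described in the post above can be sketched in one self-contained class. This is a minimal sketch under stated assumptions, not the author's code from the figures: `JwtSketch` is a hypothetical name, the header and payload are passed in as ready-made JSON strings so that no JSON library is needed, and only HS256 is handled.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class JwtSketch
{
    // Base64Url: standard Base64 without padding, using '-' and '_' instead of '+' and '/'.
    public static string Base64UrlEncode(byte[] bytes) =>
        Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    // Signature = Base64Url(HmacSha256(key, signingInput))
    public static string Sign(string signingInput, byte[] key)
    {
        using var hmac = new HMACSHA256(key);
        return Base64UrlEncode(hmac.ComputeHash(Encoding.ASCII.GetBytes(signingInput)));
    }

    public static string MakeJwt(string headerJson, string payloadJson, byte[] key)
    {
        var signingInput = Base64UrlEncode(Encoding.UTF8.GetBytes(headerJson))
            + "." + Base64UrlEncode(Encoding.UTF8.GetBytes(payloadJson));
        return signingInput + "." + Sign(signingInput, key);
    }

    public static bool VerifyJwt(string jwt, byte[] key)
    {
        var parts = jwt.Split('.');
        if (parts.Length != 3)
        {
            return false;
        }

        // Recompute the signature over the first two parts and compare it with the third.
        var expected = Encoding.ASCII.GetBytes(Sign(parts[0] + "." + parts[1], key));
        var actual = Encoding.ASCII.GetBytes(parts[2]);
        return CryptographicOperations.FixedTimeEquals(expected, actual);
    }
}
```

For example, `JwtSketch.MakeJwt("{\"alg\":\"HS256\",\"typ\":\"JWT\"}", "{\"sub\":\"123\"}", key)` returns a three-part token that `VerifyJwt` accepts with the same key and rejects with a different one. The comparison uses `CryptographicOperations.FixedTimeEquals` instead of `==`, a deliberate swap so that the time taken does not reveal how many leading characters matched.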

Json web token is a really good way to transmit data between parties because the data can be digitally signed using a cryptographic algorithm. However, many of us use it daily without knowing how it’s generated and verified internally.
In this blog post, we will explore how it’s created and how to verify it.

Json Web Token Using C#?

<html><body><h2 style="text-align:justify;">1. C# primitive types and String</h2>C#, as an object-oriented programming language, does a very good job of encapsulating implementation details, especially memory management, which leaves hardly any traces visible from the user's perspective. According to the <a href="https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/built-in-types">C# built-in types</a> page from Microsoft, String is defined as a reference type, along with Object and dynamic.<h3 style="text-align:justify;">Sizeof Operator</h3>If you run Console.WriteLine(sizeof(int)), the output should be 4 since an int occupies 4 bytes in memory. But if you do the same for String, you will get the compile error CS0233: 'string' does not have a predefined size, therefore sizeof can only be used in an unsafe context. Since String is a reference type, we cannot know the actual size of its instance. It could contain zero characters (an empty string) or hundreds of characters (a paragraph or a magazine). We can only inspect the size of its reference, and only if we run our code in an unsafe block.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-O8upD7uOc6wbEXXM4ukAY9nv5JqPonoI1UxYYKNp.png">Figure 1: Running Console.WriteLine(sizeof(string)) in an unsafe block. Because a string variable is just a reference to a memory location on the heap, the code above should output 8 on a 64-bit machine and 4 on a 32-bit machine.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-6rgHo5jMb2N2ZETD8RPEHDOtKMkAjKrNXAYWS6VD.png">Figure 2: Visualization of string’s representation in memory (simplified)<h3 style="text-align:justify;">String is reference type but immutable</h3>The behavior of string as a reference type is different from other reference types as well. 
If you create an instance of a class and change the value of its property, all the variables referring to that object reflect the changes you have made. However, once a string is created, it can't be changed. See the code below.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-6xbwJqWW6exJCB5JCfPphbZZCtXzqf5p3DqBWutH.png">Figure 3: Normal reference types behavior. The secondObject changes accordingly when the underlying object changes. Let’s look at how the string type behaves:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-07lQCXqj3QmReP7TQHTVuN1gpwB4prR539jGItgZ.png">Figure 04: String immutable proof. According to Microsoft's <a href="https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#immutability-of-strings">Immutability of string</a>: “String objects are immutable: they can't be changed after they've been created. All of the String methods and C# operators that appear to modify a string actually return the results in a new string object.” The statement above explains why secondString stays the same when firstString changes. If we dig deeper into the <a href="https://github.com/microsoft/referencesource/blob/master/mscorlib/system/string.cs#L3187">string.cs source code</a>, we find that the Concat method actually allocates new memory whose length equals the total length of the 2 input strings, then fills it with the contents of these 2 strings and returns the newly allocated string.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-UwyjsjEFbOYPY8oByWdVxfXQ3LOhxpYJRWgO8NE6.png">Figure 05: Concat method implementation<h2 style="text-align:justify;">2. Why is string immutable?</h2>I cannot find any official documentation from Microsoft about why they did it that way. 
However, there are clearly some cases where immutability helps avoid nasty bugs and improves performance. For example, if we use a string as a Dictionary key, reassigning the string variable later does not corrupt the Dictionary, because the key object itself never changes.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-C4FMyEHC42N122MfFntxp87rOyVxJ4gHRW3fADaz.png">Figure 06: String immutability keeps Dictionary safe. Another example is that immutability makes it safe to represent a substring as a view over the existing string's characters (a start index and a length) instead of a copy. This saves memory and improves performance because it avoids an allocation, and since neither the substring nor the original string can change, the behavior of the program stays the same. Note that String.Substring itself still returns a new string; in modern .NET, this kind of view-style slicing is exposed through AsSpan and ReadOnlySpan&lt;char&gt;.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-oSTOkmN178KB8YO7WScTu3OUpqJvfehG4X1SdH88.png">Figure 07: SubString method works based on the existing string. One last benefit I can think of is that we can cache these strings. This technique is called interning and can be done by the framework or manually in our code, as we will see in the next section.<h2 style="text-align:justify;">3. String interning</h2>According to Microsoft’s <a href="https://learn.microsoft.com/en-us/dotnet/api/system.string.intern?view=net-7.0#remarks">documentation</a>: “The common language runtime conserves string storage by maintaining a table, called the intern pool, that contains a single reference to each unique literal string declared or created programmatically in your program. Consequently, an instance of a literal string with a particular value only exists once in the system”. By default, the runtime interns only literal strings. 
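A minimal sketch of this default behaviour, assuming nothing beyond the base class library: `InternDemo` is a hypothetical name, and the runtime-built string is produced with StringBuilder purely so that it is not a compile-time literal.

```csharp
using System;
using System.Text;

public static class InternDemo
{
    public static (bool LiteralsShared, bool RuntimeShared, bool InternedShared) Run()
    {
        string a = "hello";
        string b = "hello";
        // Built at run time, so it is a separate object with the same content.
        string c = new StringBuilder("hel").Append("lo").ToString();

        return (
            object.ReferenceEquals(a, b),               // two identical literals share one instance
            object.ReferenceEquals(a, c),               // the runtime-built string does not
            object.ReferenceEquals(a, string.Intern(c)) // Intern returns the pooled instance
        );
    }
}
```

`InternDemo.Run()` returns `(true, false, true)`: the two literals are the same object, the string built at run time is a different object, and `string.Intern` maps it back to the pooled literal.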
If we want to intern a string programmatically, we must explicitly use the String.Intern method. Here's an example of string interning using the String.Intern method:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-BOPjOJso1yLsv1diIutUP0MHQxhR7CsqVljgbVk5.png">Figure 08: String.Intern method caches the string. Using the intern pool the right way can reduce memory usage a lot. For example, if you are writing a program that processes text with a huge amount of duplication, interning is appropriate. However, interning a string at runtime is quite costly, and the interned string will last for at least the lifetime of the program (refer <a href="https://learn.microsoft.com/en-us/dotnet/api/system.string.intern?view=net-7.0#performance-considerations">here</a>).<h2 style="text-align:justify;">4. Why do we need StringBuilder class</h2>Let's go back to the + operator. A new string is created every single time we make changes to an existing string. Suppose we write code like the following:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-ccucuNuKYgfa0eZ5xAcmB6USdWkdZv7b37RNWC57.png">Figure 9: Bad string concatenation example. This code is not efficient because we allocate a new string every time we concatenate onto the result. Doing this on a large scale could degrade performance drastically. To address the problem, Microsoft came up with the StringBuilder class.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-qUMhqxRDpivu6pEYTXbRgsMQAfWpdqL8kFzi3BQW.png">Figure 10: Using StringBuilder instead of string concatenation<h2 style="text-align:justify;">5. A glance at StringBuilder implementation and why it's better than string concatenation</h2>In this section, we will discover how the StringBuilder works. 
We will use the code below as our example.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-hQaMo37ON519If6F29pxilJX9HDq12xMypR3WtT7.png">Figure 11: Sample of StringBuilder’s usage. We will also refer to StringBuilder’s implementation on <a href="https://github.com/microsoft/referencesource/blob/master/mscorlib/system/text/stringbuilder.cs">github</a> to see why it's better than string concatenation. First, the StringBuilder class contains some members:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-fpyPu5ktyhLbunp42RK0FNz3Hccgb7Kmp9QwcfsQ.png">Figure 12: StringBuilder’s members. If we use the empty constructor, as on line 1 of our example, it boils down to the constructor which takes 3 parameters, called with 3 default arguments, and assigns values to the members above. In our case:<ul><li style="text-align:justify;">m_ChunkChars is initialized to an array which contains 16 empty elements.</li><li style="text-align:justify;">m_ChunkLength is set to 0.</li><li style="text-align:justify;">m_ChunkOffset is set to 0.</li><li style="text-align:justify;">m_ChunkPrevious is null.</li></ul><img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-BQjCOTthdSYw7RsiLNVFNEM0gcNrhPmQjAzSMfkz.png">Figure 13: StringBuilder constructor with default parameters. ThreadSafeCopy copies the initial string into the empty m_ChunkChars buffer (we don't have any initial value in this case).<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-YPcIpTKH4BVXhabEzSEzSLq71J47IJPyOysSyxFs.png">Figure 14: StringBuilder visualization. If you are curious, here's the definition of the ThreadSafeCopy method. <img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-exwZZszmmXwJXNfiUaI7HMhHFrX82Z91SKjM8Mzw.png">Figure 15: ThreadSafeCopy method’s signature. On the second line of our code, we call Append("Hello World!"). 
This method checks the inner buffer (m_ChunkChars) to see if it’s able to contain the whole string. If the current buffer cannot contain the whole string, it fills up the current buffer and creates a new StringBuilder chunk with the appropriate length to hold the rest. In our case, m_ChunkChars has a length of 16 characters, so it can hold the whole string of 12 characters without any problems. Hence, we don't need to allocate more memory, and m_ChunkLength is updated accordingly.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-hNQFWXBcz9iJ3YoIB2usEz2GAE9nurCY8Y4mouXb.png">Figure 16: Calling Append(“Hello World!”) on StringBuilder. The third line of our code is a bit trickier: because " World!” has a length of 7 and only 4 free slots remain, our m_ChunkChars doesn’t have enough space to contain the whole string. In this case, we fill the rest of m_ChunkChars and allocate more memory for the remaining 3 characters. Refer to the code below:<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-EJoImUPwWLCH0kwYhhoU3K4F6zVPY10J7TB8bSUi.png">Figure 17: Append method’s implementation. Note that we call AppendHelper inside Append. This method simply calls Append(char* value, int valueCount) using " World!" and 7 as input parameters.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-eKzfHc9oSqZsPDSPZDQRLPRld0t6Cnv70Cv1eFg3.png">Figure 18: Append method’s implementation. Inside Append(char*, int), ExpandByABlock is called with the restLength (3 in this case). This function allocates a new chunk whose m_ChunkChars is at least large enough to hold the rest of the string and otherwise grows with the current length, up to a threshold of 8000 elements.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-9jQPY02ilkx8xZOoYriaGd9kXyXTa8vbaCVRt3AE.png">Figure 19: ExpandByABlock method’s implementation. Math.Max(3, Math.Min(16, 8000)) returns 16. 
Hence, we are doubling the capacity by creating a new StringBuilder with a buffer of 16.<img src="https://api.careers.saigontechnology.com/storage/ck-editor/CKEDITOR-11ovO71bs4BIKH9cvT3wEWTg0IBE0Ajy6JnLaKtO.png">Figure 20: Calling Append(“ World!”) on third line. After the third line, we have created a new StringBuilder and set m_ChunkPrevious to point to that StringBuilder. m_ChunkOffset is set to 16, and m_ChunkLength is set to 0 as well. This approach performs better than string concatenation because there's no need to allocate space for a copy of the existing string on every append. We also pre-allocate memory, so we don't have to allocate every time the string grows.<h2 style="text-align:justify;">6. Conclusion</h2>I hope you now understand how String and StringBuilder work internally, and I hope you like the Saigon Technology Tech Blog. </body></html>
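The contrast between the two approaches discussed in the post above, and the Append sequence it traces, can be sketched as follows. This is a minimal sketch with hypothetical names (`BuilderDemo` and its methods). The capacity it reports reflects internal chunk sizes, which are an implementation detail and may differ between runtimes, so treat that value as illustrative only.

```csharp
using System.Text;

public static class BuilderDemo
{
    // Naive version: every += allocates a brand-new string and copies the old content.
    public static string ConcatNaive(int count)
    {
        var result = string.Empty;
        for (var i = 0; i < count; i++)
        {
            result += i;
        }
        return result;
    }

    // StringBuilder version: characters are appended into pre-allocated chunks.
    public static string ConcatWithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }

    // Replays the Append sequence traced in the post: 12 characters, then 7 more.
    public static (int Length, int Capacity, string Text) TraceExample()
    {
        var builder = new StringBuilder();   // default capacity of 16
        builder.Append("Hello World!");      // fits into the first chunk
        builder.Append(" World!");           // overflows, so a second chunk is added
        return (builder.Length, builder.Capacity, builder.ToString());
    }
}
```

Both `ConcatNaive(5)` and `ConcatWithBuilder(5)` return `"01234"`; the difference is only in how many intermediate strings get allocated along the way. `TraceExample()` reports a length of 19 characters.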

If you have been writing C# for a while, you have surely used String (string and String can be used interchangeably because string is just an alias of String) and the StringBuilder class. However, underneath the simple ‘+’ operator is a huge amount of work that was designed with good intentions but can also degrade performance drastically if used naively. The StringBuilder class was crafted to overcome one of those problems. Do you know what these problems are and what was resolved? 
Let’s find out in this post!

Fault tolerance patterns for microservices


OTHER ARTICLES FROM DO TRAN