
How We Built Slack AI To Be Secure and Private


At Slack, we’ve long been conservative technologists. In other words, when we invest in leveraging a new category of infrastructure, we do it rigorously. We’ve done this since we debuted machine learning-powered features in 2016, and we’ve developed a robust process and skilled team in the space.

Despite that, over the past year we’ve been blown away by the increase in capability of commercially available large language models (LLMs) — and more importantly, the difference they could make for our users’ biggest pain points. Too much to read? Too hard to find stuff? Not anymore — 90% of users who adopted AI reported a higher level of productivity than those who didn’t.

But as with any new technology, our ability to launch a product with AI is predicated on finding an implementation that meets Slack’s rigorous standards for customer data stewardship. So we set out to build not just awesome AI features, but awesome and trusted AI.

The generative model industry is quite young; it is still largely research-focused, and not enterprise-customer focused. There were few existing enterprise-grade security and privacy patterns for us to leverage when building out the new Slack AI architecture.

Instead, to inform how we built out Slack AI, we started from first principles. We began with our requirements: upholding our existing security and compliance offerings, as well as our privacy principles like “Customer Data is sacrosanct.” Then, through the specific lens of generative AI, our team created a new set of Slack AI principles to guide us.

  • Customer data never leaves Slack.
  • We do not train large language models (LLMs) on customer data.
  • Slack AI only operates on the data that the user can already see.
  • Slack AI upholds all of Slack’s enterprise-grade security and compliance requirements.

These principles made designing our architecture clearer, although sometimes more challenging. We’ll walk through how each of these informed what Slack AI looks like today.

Customer data never leaves Slack

The first, and perhaps most important, decision we faced was how to ensure that we could use a top-tier foundational model while never allowing customer data to leave Slack-controlled VPCs. In the generative model industry, most customers of foundational models were calling the hosted services directly, and alternative options were scarce.

We knew this approach wouldn’t work for us. Slack, and our customers, have high expectations around data ownership. In particular, Slack is FedRAMP High authorized, which confers specific compliance requirements, including not sending customer data outside of our trust boundary. We wanted to ensure our data didn’t leave our AWS Virtual Private Cloud (VPC) so that we could guarantee that third parties would not have the ability to retain it or train on it.

So we began to look for creative solutions where we could host a foundational model on our own infrastructure. However, most foundational models are closed-source: Their models are their secret sauce, and they don’t like to hand them to customers to deploy on their own hardware.

Fortunately, AWS has an offering where it can be the trusted broker between foundational model provider and customer: AWS SageMaker. By using SageMaker, we are able to host and deploy closed-source large language models (LLMs) in an escrow VPC, allowing us to control the lifecycle of our customers’ data and ensure the model provider has no access to Slack’s customers’ data.

And there we had it: We had access to a top-tier foundational model, hosted in our own AWS VPC, giving us assurances on our customer data.

Slack AI architecture diagram

We do not train large language models (LLMs) on customer data

The next decision was also key: We chose to use off-the-shelf models instead of training or fine-tuning models. We’ve had privacy principles in place since we began employing more traditional machine learning (ML) models in Slack, like the ones that rank search results. Among these principles are that data will not leak across workspaces, and that we offer customers a choice around these practices; we felt that, with the current, young state of this industry and technology, we couldn’t make strong enough guarantees on these practices if we trained a generative AI model using Slack’s customers’ data.

So we made the choice to use off-the-shelf models in a stateless way by employing Retrieval Augmented Generation (RAG). With RAG, you include all of the context needed to perform a task within each request, so the model does not retain any of that data. For example, when summarizing a channel, we’ll send the LLM a prompt containing the messages to be summarized, along with instructions for how to do so. The statelessness of RAG is a huge privacy benefit, but it’s a product benefit as well. All of Slack AI’s results are grounded in your company’s knowledge base — not the public Internet – which makes the results more relevant and accurate. You get the benefit of incorporating your proprietary and individual data set without the risk of a model retaining that data.
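
To make the stateless pattern concrete, here is a minimal, hypothetical sketch of a RAG-style summarization request: the caller gathers the messages the requesting user can already see, packs them into a single prompt, and sends one self-contained request to the model. The ILlmClient interface and its CompleteAsync method are assumptions made purely for illustration; they are not Slack's actual implementation or API.

using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical client abstraction for illustration; not a real Slack or AWS API.
public interface ILlmClient
{
    Task<string> CompleteAsync(string prompt, CancellationToken ct = default);
}

public class ChannelSummarizer
{
    private readonly ILlmClient _llm;

    public ChannelSummarizer(ILlmClient llm) => _llm = llm;

    public async Task<string> SummarizeAsync(IReadOnlyList<string> visibleMessages, CancellationToken ct = default)
    {
        // All context travels inside the request itself, so the model keeps no state
        // and never needs to retain the underlying messages.
        var prompt =
            "Summarize the following channel messages for the requesting user:\n\n" +
            string.Join("\n", visibleMessages);

        return await _llm.CompleteAsync(prompt, ct);
    }
}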

Using RAG can narrow down the set of models you can use; they need to have “context windows” large enough for you to pass in all the data you want to use in your task. Additionally, the more context you send an LLM, the slower your request will be, as the model needs to process more data. As you can imagine, the task of summarizing all messages in a channel can involve quite a bit of data.

This posed a challenge for us: Find a top-tier model with a large context window with fairly low latency. We evaluated a number of models and found one that suited our first use cases, summarization and search, well. There was room for improvement, though, and we began a long journey of both prompt tuning and chaining more traditional ML models with the generative models to improve the results.

RAG is getting easier and faster with each iteration of models: Context windows are growing, as is the models’ ability to synthesize data across a large context window. We’re confident that this approach can get us the quality we’re aiming for while helping ensure our customers’ data is protected.

Slack AI only operates on the data that the user can already see

It is one of our core tenets that Slack AI can only see the same data that the requesting user can see. Slack AI’s search feature, for example, will never surface any results to the user that standard search would not. Summaries will never summarize content that the user could not otherwise see while reading channels.

We ensure this by using the requesting user’s Access Control List (ACLs) when fetching the data to summarize or search and by leveraging our existing libraries that fetch the data to display in channel or on the search results page.

This wasn’t hard to do, technically speaking, but it needed to be an explicit choice; the best way to guarantee this was to build on top of, and reuse, Slack’s core feature sets while adding some AI magic at the end.

It’s worth noting, too, that only the user who invokes Slack AI can see the AI-generated output. This builds confidence that Slack is your trusted AI partner: Only the data that you can see goes in, and then only you can see the output.

Slack AI upholds all of Slack’s enterprise-grade security and compliance requirements

There’s no Slack AI without Slack, so we ensured that we integrated all of our enterprise-grade compliance and security offerings. We follow the principle of least data: We store only the data needed to complete the task, and only for the duration necessary.

Sometimes the least data is: None. Where possible, Slack AI’s outputs are ephemeral: Conversation summaries and search answers all generate point-in-time responses that are not stored on disk.

Where that’s not possible, we reused as much of Slack’s existing compliance infrastructure as possible, and built new support where we had to. Many of our compliance offerings come built into our existing infrastructure, such as Encryption Key Management and International Data Residency. For others, we built in special support to make sure that derived content, like summaries, is aware of the messages that went into it; for example, if a message is tombstoned because of Data Loss Protection (DLP), any summaries derived from that message are invalidated. This makes DLP and other administrative controls powerful with Slack AI: Where these controls were already active on Slack’s message content, they are also active on Slack AI outputs.


Whew — that was a long journey! And I didn’t even get to take you through how we build prompts, evaluate models, or handle spiky demand; we’ll save that for next time. But I’m glad we started here, with security and privacy: We want our customers to know how seriously we take protecting their data, and how we’re safeguarding it each step of the way.

 

Interested in helping us build Slack's AI capabilities? We're hiring! Apply now

The post How We Built Slack AI To Be Secure and Private appeared first on Slack Engineering.


Single, Short, Specific - Prompting GitHub Copilot for Visual Studio


When we talk about AI, we have a relatively long-winded pitch about how the quality of the data you use to train your model is critical to the quality of the outputs that the model will create. If your data is bad, the outputs won't be good. The short version of this is "Garbage in, garbage out", which is definitely shorter and easier to remember. A good prompt is often the key to a good answer. In the new short video I just published, Gwyn "GPS" Peña-Siguenza shows us that "short" shouldn't mean "too short", and what a good, specific prompt looks like.

 

 

The art of prompting

When it comes to Large Language Models (LLMs), the above is definitely true, but it also applies to the prompt, i.e., the message that you are using to instruct the LLM to give you an answer. That's why we sometimes talk about Prompt engineering, although that term is slowly being abandoned. What remains is the importance of creating good prompts in order to get good results.

I sometimes compare an LLM to a moody teenager. If you tell your kid "clean your room", chances are nothing will happen. But if you refine the prompt and tell them to "clean your room, NOW", things might happen. The key is in the specifics.

 

Single, Short, Specific

One easy-to-remember mnemonic when you work with GitHub Copilot for Visual Studio is that your prompt should be "single, short, specific".

  • Single: Don't try to pack too many instructions in your prompt.
  • Short: Don't be too long-winded, which can be confusing for the model.
  • Specific: Give specific instructions for what you are trying to achieve.

We all know this from our interactions with search engines. When we look for something, it can take a few tries before we get it "just right" and get the results we were looking for. The same applies to prompting.

 

Too short...

However, it's also possible to be too short. In the video posted above, Gwyn shows that her first prompt ("Some code") doesn't yield any result: it is just too short and not specific enough, so GitHub Copilot doesn't know where to start. But on the second attempt ("Return cities of a provided country"), Copilot returns the correct code in two steps, starting with the attribute needed for the API endpoint to be created, and then the code itself. Gwyn can then check that the code looks OK and accept it, before testing it.
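
For reference, a specific prompt like that typically produces a small minimal API endpoint along the lines of the sketch below. This is an illustrative example with a hard-coded city lookup; it is not the exact code generated in the video, and the route and data are assumptions.

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Return cities of a provided country (hard-coded sample data for illustration).
app.MapGet("/api/cities/{country}", (string country) =>
{
    var cities = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
    {
        ["France"] = new[] { "Paris", "Lyon", "Marseille" },
        ["Japan"] = new[] { "Tokyo", "Osaka", "Kyoto" }
    };

    return cities.TryGetValue(country, out var result)
        ? Results.Ok(result)
        : Results.NotFound($"No cities found for '{country}'.");
});

app.Run();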

 

Getting proficient at prompting

As our jobs evolve with new tools, it's very important to get proficient with their features. Just like a carpenter getting a fancy new electric saw, it's best to spend some time learning how it works, or they might lose a finger! The risk is lower for us software developers, but we still need to learn how the new tools work and what works best with them.

There are multiple places to start your learning journey, such as Microsoft Learn, where we have whole learning paths showing how to get the best from GitHub Copilot. For example, Introduction to prompt engineering with GitHub Copilot is a great place to start.

We have more reference material in our collection here, and of course the full video with Gwyn, published here, is also a great place to go.

 


Ep. #145, The Future of Collaborative Docs with Cara Marin of Stashpad


In episode 145 of Jamstack Radio, Brian speaks with Cara Marin of Stashpad about collaborative documents. Together they explore the shortcomings of popular collaborative tools on the market today and how they can be improved. Additionally, Cara unpacks local-first technology and how it enables elevated real-time collaboration experiences.

The post appeared first on Heavybit.





Download audio: https://media.blubrry.com/heavybit/d3aeja1uqhkije.cloudfront.net/podcasts/jamstack-radio/20240308-jamstack-radio-145.mp3

Baseline Styling in BenchmarkDotNet


In this article, we will learn how to set a baseline for our benchmark in BenchmarkDotNet and discuss styling the baseline results. As software developers, benchmarking the performance of our methods is a crucial part of our day-to-day activities. With the BenchmarkDotNet library, we can accurately perform this operation. When we use this library for […]

The post Baseline Styling in BenchmarkDotNet appeared first on Code Maze.


ASP.NET Core Basics: Getting Started with LINQ


Manipulating data is a common task for ASP.NET Core developers, and LINQ is a powerful feature that allows you to write queries directly in C# language syntax. Check out this post on how to start using LINQ, from simple queries to more complex scenarios.

Building a web application can be challenging, especially for beginners. To simplify the common tasks of accessing and manipulating data in this type of application, developers can use resources such as LINQ, which has functions for efficient data manipulation.

In this post, we will cover the process of creating a simple application using ASP.NET Core that leverages the power of LINQ to manipulate data simply and effectively. By the end of this tutorial, you will have a solid understanding of how to build a simple yet functional web application using the best of LINQ.

What Is LINQ?

LINQ, short for Language Integrated Query, is a language feature in C# that allows developers to write queries directly in the language’s syntax. In the context of ASP.NET Core, LINQ has become an invaluable tool for working with data collections.

LINQ provides a consistent model for querying and manipulating data, regardless of the data source. It allows developers to write queries using familiar syntax, making code more expressive and readable. LINQ is not limited to working with databases—it can be used with various data sources such as arrays, collections, XML and more.

In ASP.NET Core, LINQ is commonly used to query databases, especially with Entity Framework Core, a popular object-relational mapping (ORM) framework.

EF Core allows developers to interact with databases using C# objects, and LINQ provides a natural and efficient way to query and transform this data. Another advantage of LINQ is that it is part of the .NET base class library (the System.Linq namespace), so all of its resources are available without third-party libraries, and it receives updates with each new version of .NET.

The image below demonstrates how it is possible to optimize the C# syntax using the LINQ approach instead of the traditional approach.

Traditional approach VS LINQ approach
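
Since the image may not be visible here, the sketch below illustrates the same contrast: filtering and projecting a list with an imperative foreach loop versus a single declarative LINQ expression. The Student record and sample data are assumed purely for this example.

using System;
using System.Collections.Generic;
using System.Linq;

var students = new List<Student>
{
    new("Ana", 17),
    new("Bruno", 21),
    new("Carla", 25)
};

// Traditional approach: imperative loop with manual filtering and projection.
var adultNamesLoop = new List<string>();
foreach (var student in students)
{
    if (student.Age >= 18)
    {
        adultNamesLoop.Add(student.Name);
    }
}

// LINQ approach: the same intent expressed declaratively in a single statement.
var adultNamesLinq = students
    .Where(student => student.Age >= 18)
    .Select(student => student.Name)
    .ToList();

Console.WriteLine(string.Join(", ", adultNamesLinq)); // Bruno, Carla

record Student(string Name, int Age);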

Practicing LINQ in an ASP.NET Core Application

To practice using LINQ, we will create a simple application using SQLite as a database and EF Core as an ORM to register students. Then we will see the main LINQ resources available for data manipulation.

To create the example application, you need to have the latest version of .NET installed. This post uses version 8. For an IDE, this post uses Visual Studio Code. It is also necessary to have EF Core installed locally; to install it, just run the command dotnet tool install --global dotnet-ef in the terminal.

You can access the complete source code here: StudentHub source code.

Open a terminal and execute the following commands to create the app and install the NuGet packages.

dotnet new web -o StudentHub

cd StudentHub

dotnet add package Microsoft.EntityFrameworkCore.Sqlite

dotnet add package Microsoft.EntityFrameworkCore.Design

dotnet add package Swashbuckle.AspNetCore

Now let’s create classes to represent the student, course and enrollment entities. Create a new folder called “Models,” and within it create the classes below:

  • Student
namespace StudentHub.Models;

public class Student
{
  public Guid Id { get; set; }
  public string FirstName { get; set; }
  public string LastName { get; set; }
  public DateTime DateOfBirth { get; set; }
  public string Address { get; set; }
  public string PhoneNumber { get; set; }
  public string Email { get; set; }
  public string Gender { get; set; }
  public int Age { get; set; }
  public List<string> Courses { get; set; }
  public double Score { get; set; }
  
  public Student()
  {
  }
}
  • Course
namespace StudentHub.Models;

public class Course
{
  public Guid CourseId { get; set; }
  public string CourseName { get; set; }
}
  • Enrollment
namespace StudentHub.Models;

public class Enrollment
{
  public Guid EnrollmentId { get; set; }
  public Guid StudentId { get; set; }
  public Guid CourseId { get; set; }
}

The next step is to create the context class that will implement the connection to the database and instantiate the list of student records. So, create a folder called “Data” and inside it create the class below:

  • StudentDbContext
using Microsoft.EntityFrameworkCore;
using StudentHub.Models;

namespace StudentHub.Data;

public class StudentDbContext : DbContext
{
  public DbSet<Student> Students { get; set; }
  public DbSet<Course> Courses { get; set; }
  public DbSet<Enrollment> Enrollments { get; set; }

  public StudentDbContext(DbContextOptions<StudentDbContext> options)
    : base(options)
  {
  }
}

LINQ Standard Query Operators

LINQ Standard Query Operators are methods and expressions that allow you to perform query operations on data collections. These operators are used to filter, sort, group and project data in LINQ queries. They are an integral part of the C# language and other .NET languages that support LINQ.

Standard query operators are divided into several categories, each performing a specific operation on a sequence (collection) of elements.

Next, we will get to know each of these categories and look at a practical example of each. Don’t worry about the project for now—at the end of the explanation we will implement everything in the example code.

1. Filtering

To filter elements using LINQ, we use the “Where” extension method. Example:

public IEnumerable<Student> GetMaleStudents()
{
  var result = from student in _db.Students
    where student.Gender == "Male"
    select student;
  
  return result;
}

Note that in this code the variable “result” stores the result of the LINQ query from student in _db.Students where student.Gender == "Male", which filters students to include only those whose Gender property is equal to "Male". The select student clause indicates that the query returns the student object for each item that meets the specified criteria.

2. Projection

Projection refers to the ability to transform or select specific properties of objects in a sequence. In LINQ, projection is mainly performed using the Select operator. The goal is to create a new form of data containing only the information needed for the task.

The Select operator allows you to specify the projection of data from a collection to a new composition.

public IEnumerable<string> GetStudentFullNames()
{
  var result = from student in _db.Students
    select $"{student.FirstName} {student.LastName}";

  return result;
}

The code above creates a sequence of strings containing the students’ full names, selecting only the FirstName and LastName properties through the LINQ select operator.

3. Ordering

Element ordering in LINQ is done using the OrderBy and OrderByDescending operators, which sort the elements of a sequence based on specific criteria. They can be combined with a select clause in query syntax or used directly as extension methods, as in the examples below.

OrderBy:

The OrderBy operator is used to sort the elements of a sequence in ascending order based on a specified key. The key is usually a property of the object or an expression that returns a value that will be used to determine the order.

public IEnumerable<Student> GetStudentsOrderedByName()
{
  var result = _db.Students.OrderBy(student => student.FirstName);

  return result;
}

In this example, OrderBy(student => student.FirstName) sorts the students in ascending order by first name; in this case, it will sort the students alphabetically from A to Z.

OrderByDescending:

The OrderByDescending operator works similarly to OrderBy but sorts elements in descending order.

public IEnumerable<Student> GetStudentsByDescendingName()
{
  var result = _db.Students.OrderByDescending(student => student.FirstName);

  return result;
}

In this example, OrderByDescending(student => student.FirstName) sorts the students by first name in descending order.

ThenBy:

The ThenBy method is used in LINQ to perform a secondary sort on an already sorted sequence. In other words, you can use ThenBy to specify a second ordering criterion that is applied when elements have the same value for the first criterion.

public IEnumerable<Student> GetStudentsOrderedByNameAndThenByAge()
{
  var result = _db.Students.OrderBy(student => student.FirstName).ThenBy(student => student.Age);

  return result;
}

In this example, students are first ordered by first name (OrderBy(student => student.FirstName)), and then, if two students have the same name, they will be ordered by age (ThenBy(student => student.Age)).

4. Grouping

The grouping operator in LINQ is represented by the GroupBy method. It is used to group elements of a sequence based on a specific key. The grouping operation creates groups of elements that share the same key. Example:

public IEnumerable<IGrouping<int, Student>> GroupStudentsByAge()
{
  var result = _db.Students.GroupBy(student => student.Age);

  return result;
}

In the example above, the GroupBy operator is used to group students based on their age. It takes a lambda expression that specifies the grouping key, in this case student => student.Age.

The GroupStudentsByAge method, when called, will return a sequence of groups, where each group represents a specific age and contains the collection of students with that age.
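
As an assumed usage example (studentService below stands for an instance of the StudentService class shown later; it is not part of the article's code), consuming the grouped result could look like this:

// Iterate each age group and list the students it contains.
foreach (var group in studentService.GroupStudentsByAge())
{
    Console.WriteLine($"Age {group.Key}: {group.Count()} student(s)");
    foreach (var student in group)
    {
        Console.WriteLine($"  {student.FirstName} {student.LastName}");
    }
}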

GroupJoin:

Performs a join between two sequences and groups the results.

public IEnumerable<IGrouping<string, Student>> GroupStudentsByCourses()
{
  return _db.Students
    .GroupJoin(
      _db.Enrollments,
      student => student.Id,
      enrollment => enrollment.StudentId,
      (student, enrollments) => new { student, enrollments }
    )
    .SelectMany(
      x => x.enrollments.DefaultIfEmpty(),
      (x, enrollment) => new { Student = x.student, Enrollment = enrollment }
    )
    .GroupBy(x => x.Enrollment.CourseId.ToString(), x => x.Student);
}

ToLookup: Converts a sequence to a Lookup object, a dictionary-like structure that maps each key to all of the elements that share it.

public ILookup<Guid, Student> ToLookupByStudentId()
{
  return _db.Students.ToLookup(student => student.Id);
}
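
A lookup behaves like a read-only, one-to-many dictionary. A hypothetical usage sketch (the studentId value is just a placeholder key, and studentService stands for an instance of the service class shown later):

// Index the lookup by a key to get every element that shares it.
var lookup = studentService.ToLookupByStudentId();
var studentId = Guid.NewGuid(); // placeholder key for illustration
foreach (var student in lookup[studentId])
{
    Console.WriteLine($"{student.FirstName} {student.LastName}");
}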

5. Joining

The join operation in LINQ is performed using the join operator. It allows you to combine elements from two different sequences based on a specified condition, producing a new sequence of corresponding elements. Example:

public IEnumerable<string> GetStudentCourseNames()
{
  var result = from student in _db.Students
    join enrollment in _db.Enrollments on student.Id equals enrollment.StudentId
    join course in _db.Courses on enrollment.CourseId equals course.CourseId
    select $"{student.FirstName} {student.LastName} - {course.CourseName}";
  return result;
}

The GetStudentCourseNames method uses the join operation between three tables: Students, Enrollments and Courses. The objective is to obtain a sequence of strings that represent the names of students and the names of the courses in which they are enrolled. This is an effective way to get related data from multiple tables in a single LINQ query.

6. Aggregation

In LINQ, aggregation refers to the application of operations that combine elements of a sequence into a single value. Aggregation operators are used to perform calculations on a sequence, such as adding, counting, finding the minimum or maximum, calculating the average, etc. Some of the most common aggregation operators in LINQ include Count, Sum, Min, Max and Average.

Example:

var numbers = new List<int> { 1, 2, 3, 4, 5 };

var sum = numbers.Sum(); // 1 + 2 + 3 + 4 + 5 = 15
var count = numbers.Count(); // Number of elements = 5
var minimum = numbers.Min(); // Minimum value = 1
var maximum = numbers.Max(); // Maximum value = 5
var average = numbers.Average(); // Average = (1 + 2 + 3 + 4 + 5) / 5 = 3

7. Partitioning

The “Partitioning” category in LINQ refers to operators that allow you to divide a sequence into smaller partitions. The main operators in this category are Skip and Take.

Take:

Returns a specified number of elements from the beginning of a sequence.

Example:

public IEnumerable<Student> GetFirstTwoStudents()
{
  var result = _db.Students.Take(2);

  return result;
}

Skip:

Skips a specified number of elements at the beginning of a sequence and returns the remaining elements.

Example:

public IEnumerable<Student> GetStudentsAfterSkippingFirstTwo()
{
  var result = _db.Students.Skip(2);

  return result;
}
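
Skip and Take are frequently combined to implement paging. A small sketch, assuming the article's _db context and pageNumber/pageSize values supplied by the caller (this method is not part of the article's service class):

public IEnumerable<Student> GetStudentsPage(int pageNumber, int pageSize)
{
  // Order first so the paging is deterministic, then skip the earlier pages
  // and take only the requested page.
  return _db.Students
    .OrderBy(student => student.LastName)
    .Skip((pageNumber - 1) * pageSize)
    .Take(pageSize);
}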

8. Additional Filtering Operators

There are additional filtering operators such as OfType and Distinct.

OfType:

Filters the elements of a sequence to include only those of a certain type.

public IEnumerable<Student> FilterStudentsByType()
{
  return _db.Students.OfType<Student>();
}

Distinct:

Returns distinct elements from a sequence.

public IEnumerable<string> GetDistinctCourses()
{
  return _db.Students
    .AsEnumerable()
    .SelectMany(student => student.Courses)
    .Distinct();
}

9. Set Operators

Set operators in LINQ perform operations between sets of elements, such as union, intersection and difference. They are particularly useful when you are working with two or more sequences and want to perform operations that involve combining or comparing elements between them.

Union:

Returns the union of two sequences.

public IEnumerable<string> UnionStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
{
  var unionList = firstList.Union(secondList);
  return unionList.Select(student => $"{student.FirstName} {student.LastName}");
}

Intersect:

Returns the intersection of two sequences.

public IEnumerable<string> IntersectStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
{
  var intersectList = firstList.Intersect(secondList);
  return intersectList.Select(student => $"{student.FirstName} {student.LastName}");
}

Except:

Returns elements that are in the first sequence but not in the second.

public IEnumerable<string> ExceptStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
{
  var exceptList = firstList.Except(secondList);
  return exceptList.Select(student => $"{student.FirstName} {student.LastName}");
}

10. Quantification Operators

Quantification operators in LINQ are used to evaluate conditions concerning all or part of the elements in a sequence. There are two main operators in this category: All and Any.

All:

Checks whether all elements meet a condition.

public bool CheckIfAllStudentsPassed()
{
  return _db.Students.All(student => student.Score >= 60);
}

Any:

Checks whether at least one element meets a condition.

public bool CheckIfAnyStudentFailed()
{
  return _db.Students.Any(student => student.Score < 60);
}

Continuing the Example Application

These were the main LINQ Standard Query Operators. Now that we have seen an example of each one, let’s implement them in the application and create endpoints to access them.

Create a new folder called “Services” and inside it create the class below:

  • StudentService
using StudentHub.Data;
using StudentHub.Models;

namespace StudentHub.Services;
public class StudentService
{
  private readonly StudentDbContext _db;

  public StudentService(StudentDbContext db)
  {
    _db = db;
  }

  public async Task CreateStudent(Student student)
  {
    _db.Students.Add(student);
    await _db.SaveChangesAsync();
  }

  // Filtering
  public IEnumerable<Student> GetMaleStudents()
  {
    var result = from student in _db.Students
      where student.Gender == "Male"
      select student;

    return result;
  }

  public IEnumerable<Student> GetFemaleStudents()
  {
    var result = from student in _db.Students
      where student.Gender == "Female"
      select student;
   
    return result;
  }

  public IEnumerable<Student> GetOthersStudents()
  {
    var result = from student in _db.Students
      where student.Gender == "Others"
      select student;

    return result;
  }
  
  // OfType
  public IEnumerable<Student> GetMaleStudentsOnly()
  {
    var result = _db.Students.OfType<Student>().Where(s => s.Gender == "Male");

    return result;
  }

  public IEnumerable<Student> GetFemaleStudentsOnly()
  {
    var result = _db.Students.OfType<Student>().Where(s => s.Gender == "Female");

    return result;
  }

  public IEnumerable<Student> GetOthersStudentsOnly()
  {
    var result = _db.Students.OfType<Student>().Where(s => s.Gender == "Others");

    return result;
  }

  // Projection
  public IEnumerable<string> GetStudentFullNames()
  {
    var result = from student in _db.Students
      select $"{student.FirstName} {student.LastName}";

    return result;
  }

  // SelectMany
  public IEnumerable<string> GetCoursesForAllStudents()
  {
    var result = _db.Students
      .AsEnumerable()
      .SelectMany(student => student.Courses)
      .ToList();

    return result;
  }

  // Partitioning Take
  public IEnumerable<Student> GetFirstTwoStudents()
  {
    var result = _db.Students.Take(2);

    return result;
  }

  // Partitioning Skip
  public IEnumerable<Student> GetStudentsAfterSkippingFirstTwo()
  {
    var result = _db.Students.Skip(2);

    return result;
  }

  // Ordering OrderBy
  public IEnumerable<Student> GetStudentsOrderedByName()
  {
    var result = _db.Students.OrderBy(student => student.FirstName);

    return result;
  }

  // Ordering ByDescending
  public IEnumerable<Student> GetStudentsByDescendingName()
  {
    var result = _db.Students.OrderByDescending(student => student.FirstName);

    return result;
  }

  // ThenBy
  public IEnumerable<Student> GetStudentsOrderedByNameAndThenByAge()
  {
    var result = _db.Students.OrderBy(student => student.FirstName).ThenBy(student => student.Age);

    return result;
  }

  // Reverse
  public IEnumerable<Student> GetStudentsReversed()
  {
    var result = _db.Students
      .AsEnumerable()
      .Reverse();

    return result;
  }

  // Grouping
  public IEnumerable<IGrouping<int, Student>> GroupStudentsByAge()
  {
    var result = _db.Students.GroupBy(student => student.Age);

    return result;
  }

  public IEnumerable<IGrouping<string, Student>> GroupStudentsByCourses()
  {
    return _db.Students
      .GroupJoin(
        _db.Enrollments,
        student => student.Id,
        enrollment => enrollment.StudentId,
        (student, enrollments) => new { student, enrollments }
      )
      .SelectMany(
        x => x.enrollments.DefaultIfEmpty(),
        (x, enrollment) => new { Student = x.student, Enrollment = enrollment }
      )
      .GroupBy(x => x.Enrollment.CourseId.ToString(), x => x.Student);
  }

  public ILookup<Guid, Student> ToLookupByStudentId()
  {
    return _db.Students.ToLookup(student => student.Id);
  }

  //Join
  public IEnumerable<string> GetStudentCourseNames()
  {
    var result = from student in _db.Students
      join enrollment in _db.Enrollments on student.Id equals enrollment.StudentId
      join course in _db.Courses on enrollment.CourseId equals course.CourseId
      select $"{student.FirstName} {student.LastName} - {course.CourseName}";

    return result;
  }

  // Conversion
  public Student[] ConvertToStudentArray()
  {
    var result = _db.Students.ToArray();

    return result;
  }

  // Element
  public Student GetFirstStudent()
  {
    var result = _db.Students.First();

    return result;
  }

  // Aggregation
  public double GetAverageScore()
  {
    var result = _db.Students.Average(student => student.Score);

    return result;
  }

  // Additional filtering operators
  public IEnumerable<Student> FilterStudentsByType()
  {
    return _db.Students.OfType<Student>();
  }

  public IEnumerable<string> GetDistinctCourses()
  {
    return _db.Students
      .AsEnumerable()
      .SelectMany(student => student.Courses).Distinct();
  }

  // Set Operators
  public IEnumerable<string> UnionStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
  {
    var unionList = firstList.Union(secondList);
    return unionList.Select(student => $"{student.FirstName} {student.LastName}");
  }

  public IEnumerable<string> IntersectStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
  {
    var intersectList = firstList.Intersect(secondList);
    return intersectList.Select(student => $"{student.FirstName} {student.LastName}");
  }

  public IEnumerable<string> ExceptStudentNames(IEnumerable<Student> firstList, IEnumerable<Student> secondList)
  {
    var exceptList = firstList.Except(secondList);
    return exceptList.Select(student => $"{student.FirstName} {student.LastName}");
  }

  // Quantification Operators
  public bool CheckIfAllStudentsPassed()
  {
    return _db.Students.All(student => student.Score >= 60);
  }

  public bool CheckIfAnyStudentFailed()
  {
    return _db.Students.Any(student => student.Score < 60);
  }
}

The code above contains all the methods discussed previously.

Now let’s create the endpoints that will call these methods, in addition to the dependency injection configurations. To do this, replace the existing code in the Program.cs file with the code below:

using Microsoft.EntityFrameworkCore;
using Microsoft.OpenApi.Models;
using StudentHub.Data;
using StudentHub.Models;
using StudentHub.Services;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<StudentDbContext>(options =>
{
  options.UseSqlite("Data Source=students_db.db");
});
builder.Services.AddTransient<StudentService>();
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen(c =>
{
  c.SwaggerDoc("v1", new OpenApiInfo { Title = "StudentHub", Version = "v1" });
});

var app = builder.Build();

app.UseSwagger();
app.UseSwaggerUI(c =>
{
  c.SwaggerEndpoint("/swagger/v1/swagger.json", "StudentHub API V1");
  c.RoutePrefix = string.Empty;
});

// Map API endpoints
app.MapPost("/api/students", async (Student student, StudentService studentService) =>
{
  await studentService.CreateStudent(student);
  return Results.Ok();
});

app.MapGet("/api/students/male", (StudentService studentService) => studentService.GetMaleStudents());

app.MapGet("/api/students/female", (StudentService studentService) => studentService.GetFemaleStudents());

app.MapGet("/api/students/others", (StudentService studentService) => studentService.GetOthersStudents());

app.MapGet("/api/students/male-only", (StudentService studentService) => studentService.GetMaleStudentsOnly());

app.MapGet("/api/students/female-only", (StudentService studentService) => studentService.GetFemaleStudentsOnly());

app.MapGet("/api/students/others-only", (StudentService studentService) => studentService.GetOthersStudentsOnly());

app.MapGet("/api/students/full-names", (StudentService studentService) => studentService.GetStudentFullNames());

app.MapGet("/api/students/courses", (StudentService studentService) => studentService.GetCoursesForAllStudents());

app.MapGet("/api/students/first-two", (StudentService studentService) => studentService.GetFirstTwoStudents());

app.MapGet("/api/students/ordered-by-name", (StudentService studentService) => studentService.GetStudentsOrderedByName());

app.MapGet("/api/students/ordered-by-name-age", (StudentService studentService) => studentService.GetStudentsOrderedByNameAndThenByAge());

app.MapGet("/api/students/reversed", (StudentService studentService) => studentService.GetStudentsReversed());

app.MapGet("/api/students/grouped-by-age", (StudentService studentService) => studentService.GroupStudentsByAge());

app.MapGet("/api/students/to-array", (StudentService studentService) => studentService.ConvertToStudentArray());

app.MapGet("/api/students/first", (StudentService studentService) => studentService.GetFirstStudent());

app.MapGet("/api/students/average-score", (StudentService studentService) => studentService.GetAverageScore());

app.MapGet("/api/students/filterByType", (StudentService studentService) => studentService.FilterStudentsByType());

app.MapGet("/api/students/distinctCourses", (StudentService studentService) => studentService.GetDistinctCourses());

app.MapGet("/api/students/unionNames", (StudentService studentService, IEnumerable<Student> firstList, IEnumerable<Student> secondList) => studentService.UnionStudentNames(firstList, secondList));

app.MapGet("/api/students/intersectNames", (StudentService studentService, IEnumerable<Student> firstList, IEnumerable<Student> secondList) => studentService.IntersectStudentNames(firstList, secondList));

app.MapGet("/api/students/exceptNames", (StudentService studentService, IEnumerable<Student> firstList, IEnumerable<Student> secondList) => studentService.ExceptStudentNames(firstList, secondList));

app.MapGet("/api/students/allPassed", (StudentService studentService) => studentService.CheckIfAllStudentsPassed());

app.MapGet("/api/students/anyFailed", (StudentService studentService) => studentService.CheckIfAnyStudentFailed());

app.MapGet("/api/students/groupByCourses", (StudentService studentService) => studentService.GroupStudentsByCourses());

app.MapGet("/api/students/toLookupByStudentId", (StudentService studentService) => studentService.ToLookupByStudentId());

app.MapGet("/api/students/studentCourseNames", (StudentService studentService) => studentService.GetStudentCourseNames());

app.MapGet("/api/students/descendingNames", (StudentService studentService) => studentService.GetStudentsByDescendingName());

app.MapGet("/api/students/skipFirstTwo", (StudentService studentService) => studentService.GetStudentsAfterSkippingFirstTwo());

app.Run();

Our application is almost ready. We still need to execute the EF Core migration commands to generate the database and tables. Open a terminal within the application and execute the following commands.

  1. dotnet ef migrations add InitialCreate

  2. dotnet ef database update

Finally, we are ready to test the LINQ functions. To do this, simply run the application with the command dotnet run and access the Swagger interface in the browser: http://localhost:PORT/index.html.

Then we can perform some operations as shown in the GIF below:

Application running

Considerations When Using LINQ

LINQ is a powerful extension of the C# language that allows integrated queries on collections of data and is often used for database queries together with the Entity Framework. Although LINQ offers many advantages, there are also some disadvantages to consider:

Performance:

In some situations, LINQ queries may result in less optimized SQL queries than manual SQL queries written by an experienced developer. This can lead to lower performance on complex or heavy queries.

Complexity of Understanding:

For less experienced developers, LINQ syntax can be more complex to understand than writing traditional SQL queries. Depending on the scenario, LINQ queries can increase the learning curve and make code maintenance difficult.

Expressivity Limitations:

Although LINQ is quite expressive, there may be cases where the LINQ syntax is not flexible enough to express complex or specific queries. In such situations, developers may need to resort to traditional SQL queries.

Difficulty in Debugging:

Debugging LINQ queries can be more challenging than debugging traditional C# code or SQL queries. Viewing and inspecting LINQ queries while debugging may not be as intuitive as examining SQL queries directly.

Memory Performance Limitations:

In some cases, especially when dealing with large in-memory data sets, using LINQ can result in excessive resource consumption and affect application performance.
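
A common EF Core example of this pitfall: materializing the sequence too early moves the filtering into application memory. The sketch below assumes the article's _db context and Score property.

// Pulls every student into memory first, then filters client-side.
var inMemory = _db.Students
  .AsEnumerable()
  .Where(student => student.Score >= 60)
  .ToList();

// Keeps the filter in the translated SQL query, so only matching rows are loaded.
var inDatabase = _db.Students
  .Where(student => student.Score >= 60)
  .ToList();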

Compatibility with Specific Databases:

Some specific database functionalities may not be fully supported by LINQ. In some cases, it may be necessary to resort to direct SQL queries.

Conclusion

It is important to highlight that, despite these disadvantages, LINQ offers many advantages, such as better code readability, greater productivity and the possibility of writing queries more securely. The choice between using direct LINQ and SQL queries will depend on the specific needs of the project and the preferences of the development team.

In this post, we learned the main LINQ query methods with simple and more complex examples. Despite the considerations above, always consider using LINQ to increase your productivity when working with a database.


SQL Server–Export varbinary column to disk


At one of my customers we store files in the database in varbinary(max) columns. To debug an issue we wanted to read out the files.

Of course we could create a small program to do this, but we wanted to do it directly from the SQL Server Management Studio.

We created a small database script that uses OLE Automation procedures:

To use this script for your own purposes, replace the query after the cursor creation:

DECLARE FILEPATH CURSOR FAST_FORWARD FOR <Add your own SELECT query here>

The first time that we executed this script, it failed with the following error messages:

Msg 15281, Level 16, State 1, Procedure sp_OACreate, Line 1 [Batch Start Line 0]

SQL Server blocked access to procedure 'sys.sp_OACreate' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', search for 'Ole Automation Procedures' in SQL Server Books Online.

Msg 15281, Level 16, State 1, Procedure sp_OASetProperty, Line 1 [Batch Start Line 0]

SQL Server blocked access to procedure 'sys.sp_OASetProperty' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', search for 'Ole Automation Procedures' in SQL Server Books Online.

Msg 15281, Level 16, State 1, Procedure sp_OAMethod, Line 1 [Batch Start Line 0]

SQL Server blocked access to procedure 'sys.sp_OAMethod' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', search for 'Ole Automation Procedures' in SQL Server Books Online.

SQL Server blocked access to procedure 'sys.sp_OADestroy' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', search for 'Ole Automation Procedures' in SQL Server Books Online.

The reason it failed was that OLE Automation Procedures were disabled. To fix it, we had to enable this feature at the server instance level. Right-click on the instance in SQL Server Management Studio and select Facets.

There select the Server Configuration section from the Facets dropdown and set the OleAutomationEnabled property to true:

 


Important: the files are written to the specified path on the database server(!).
