Content Developer II at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Why Every Engineering Student Should Be on GitHub!

1 Share

Hello future coders! Whether you're a freshman just stepping into the world of coding or a seasoned senior wrapping up degree projects, GitHub should be your go-to platform. Embrace it as your new digital playground where the possibilities are as vast as your imagination. On GitHub, you can store and manage your code, collaborate with peers, and contribute to open-source projects. Imagine creating something extraordinary, collaborating with friends across the globe—GitHub makes it possible!

GitHub: A Game-Changer for Students

For students, GitHub is not merely a tool; it's your passport to the tech universe. It enables you to track your coding journey, collaborate on group assignments, and build a professional portfolio showcasing your projects. Here’s why GitHub becomes indispensable:

    • Code Management & Version Control: Keep track of what has changed, who changed it, and why. Version control is vital in software development.
    • Collaboration: GitHub supports team projects, allowing multiple contributors to work on a project. You can also engage with the global community by joining open-source initiatives.
    • Learning and Feedback: Get exposed to industry-standard practices and receive constructive feedback from seasoned developers worldwide.
    • Portfolio Building: GitHub showcases your projects to potential employers, demonstrating your real-world coding skills and collaborative experiences.

Whether involved in a university project or pursuing a personal coding venture, GitHub ensures a streamlined and organized experience.

Unlocking the GitHub Student Developer Pack 

One exciting feature you shouldn't miss is the GitHub Student Developer Pack. This pack is a treasure trove of free tools and resources crafted specifically for students, elevating your academic and personal projects to the next level.

 

What’s Inside the Pack?

  • Software: Access industry-standard software for coding, design, and development.
  • Cloud Services: Leverage cloud platforms to test and deploy your applications.
  • Developer Tools: Tap into a variety of tools to enhance your coding, testing, and deployment skills.

How to Redeem the GitHub Student Developer Pack

  1. Sign Up: Head over to the GitHub Education page.
  2. Verify Your Student Status: Use your official student email to register and verify your academic status.
  3. Enjoy: Dive into a suite of powerful resources and tools that support both your learning and projects.

Utilizing these resources will not only sharpen your technical acumen but also keep you ahead of the curve in the ever-evolving tech landscape.

Looking Ahead: GitHub Copilot

Stay tuned for our upcoming blog post, where we'll delve into GitHub Copilot. This AI-powered tool is set to become your coding companion, assisting you in writing better code, faster. GitHub Copilot is changing the future of coding, and we’ll guide you on how to make it part of your toolkit.

In conclusion, leveraging GitHub is not just about enhancing your coding skills but about building your future in technology. Whether it's connecting with a vast network of developers or refining your portfolio, GitHub positions you strategically in the ever-competitive tech domain.

So, get started on GitHub today, and unlock the next level of your coding journey!

Read the whole story
alvinashcraft
just a second ago
reply
West Grove, PA
Share this story
Delete

GitHub Copilot Bootcamp


GitHub Copilot Bootcamp is a series of four live classes designed to teach you tips and best practices for using GitHub Copilot. Discover how to create quick solutions, automate repetitive tasks, and collaborate effectively on projects. REGISTER NOW!

Why participate?

GitHub Copilot is not just a code suggestion tool, but a programming partner that understands your needs and accelerates your work. By participating in the bootcamp, you will have the opportunity to:

  1. Master the creation of effective prompts.
  2. Learn to develop web applications using AI.
  3. Discover how to automate tests and generate documentation.
  4. Explore collaboration practices and automated deployment.

Agenda

Sessions are scheduled for 12pm PT, 1pm MT, 2pm CT, and 3pm ET.

📅 February 4, 2025

Prompt Engineering with GitHub Copilot

Learn how GitHub Copilot works and master responsible AI to boost your productivity.

📅 February 6, 2025

Building an AI Web Application with Python and Flask

Create amazing projects with AI integration and explore using GitHub Copilot to simplify tasks.

📅 February 11, 2025

Productivity with GitHub Copilot: Docs and Unit Tests

Automate documentation and efficiently develop tests by applying concepts directly to real-world projects.

📅 February 13, 2025

Collaboration and Deployment with GitHub Copilot

Learn to create GitHub Actions, manage pull requests, and use GitHub Copilot for Azure for deployment.

Who can participate?

If you are a developer, student, or technology enthusiast, this bootcamp is for you. The classes are designed to cater to both beginners and experienced professionals.

How to apply?

Secure your spot now and start your journey to mastering GitHub Copilot!

👉 REGISTER NOW!


Learn New Skills in the New Year


New year’s resolution: Start writing better code faster in 2025.

Kick off the new year by learning new developer skills and elevate your career to the next level.

In this post, we explore learning resources and live events that will help you build critical skills and get started with cutting-edge technologies. Learn how to build custom agents, code intelligent apps with familiar tools, discover new possibilities in .NET 9, use Copilot for testing and debugging, and more. Plus, get details about using GitHub Copilot in Visual Studio Code—for free! 

 

New AI for Developers page
Check out the new AI for Developers page. It's packed with free GitHub courses on building apps, machine learning, and mastering GitHub Copilot for pair programming. Learn your way and skill up for what's next in AI.

Use GitHub Copilot in Visual Studio Code for free
Did you hear the news? You can now use GitHub Copilot in Visual Studio Code for free. Get details about the new Copilot Free plan and add Copilot to your developer toolbox.

What is Copilot Studio?
Have questions about Copilot Studio? This article from Microsoft Learn covers all the basics you need to know about Copilot Studio—the low-code tool for easily building agents and extending Microsoft 365 Copilot.

From C# to ChatGPT: Build Generative AI Solutions with Azure
Combine your C# skills with the cutting-edge power of ChatGPT and Azure OpenAI Service. This free learning path introduces you to building GenAI solutions, using REST APIs, SDKs, and Azure tools to create more intelligent applications.

Register for the Powerful Devs Conference + Hackathon
Register for the Powerful Devs Conference + Hackathon (February 12-28, 2025) and get more out of Power Platform. This one-day online conference is followed by a 2-week hackathon focused on building intelligent applications with less effort.

Code the future with Java and AI: RSVP for Microsoft JDConf 2025 today
Get ready for the JDConf 2025—Microsoft's annual event for Java developers. Taking place April 9-10, this year’s event will have three separate live streams to cover different regions. Join to explore tools and skills for building modern apps in the cloud and integrating AI.         

Build custom agents for Microsoft Teams
Learn how to build custom agents for Microsoft Teams. This free learning path will teach you about different Copilot stacks, working with Azure OpenAI, and building a custom engine agent. Start building intelligent Microsoft Teams apps using LLMs and AI components.

Microsoft Learn: Debug your app with GitHub Copilot in Visual Studio
Debug more efficiently using GitHub Copilot. This Microsoft Learn article shows you how. Discover how Copilot will answer detailed questions about your code and provide bug fixes.  

Make Azure AI Real: Watch Season 2
Elevate your AI game with Make Azure AI Real on demand. Season 2 digs into the latest Azure AI advancements, with practical demos, code samples, and real-world use cases.

GitHub Copilot Bootcamp
Streamline your workflow with GitHub Copilot—craft more effective prompts and automate repetitive tasks like testing. This GitHub Copilot Bootcamp is a 4-part live streaming series that will help you master GitHub Copilot.

10 Days of GenAI – Gift Guide Edition
Start building your own Gen AI application. These short videos outline 10 steps for creating your app—choose a model, add functions, fine tune responses, and more.

Extend Microsoft 365 Copilot with declarative agents using Visual Studio Code
Check out this new learning path from Microsoft Learn to discover how you can extend Microsoft 365 Copilot with declarative agents using VS Code. Learn about declarative agents and how they work.   

Developer's guide to building your own agents
Want to build your own agents? Watch this Ignite session on demand for a look at the new agent development tools. Find out how to create agents built on Microsoft 365 Copilot or your custom AI engine.

Master distributed application development with .NET Aspire
Get started with .NET Aspire—an opinionated, cloud-ready stack for building distributed applications with .NET. This series covers everything from setup to deployment. Start your journey toward mastering distributed app development.

Learn: What's new in .NET 9
Discover what's new in .NET 9. Learn about new features for AI, improvements for building cloud-native apps, performance enhancements, updates to C#, and more. Read the overview and get started with .NET 9.

Become a .NET AI engineer using the OpenAI library for .NET
Use your .NET skills to become an AI engineer. With the OpenAI library, .NET developers can quickly master critical AI skills and apply them to real world apps. Read the blog to learn more about the OpenAI library for .NET.

Test like a pro with Playwright and GitHub Copilot
Supercharge your testing using Playwright and GitHub Copilot. Watch this in-depth demo and discover how you can easily create end-to-end tests using Playwright's powerful built-in code generator.

 


Toddle.dev: The Future of Visual Web Application Development

A look at how Toddle.dev helps designers and developers work together by using a visual tool to build web applications more easily.

Rider 2025.1 Roadmap


The start of the new year is the perfect time to share our plans for JetBrains Rider 2025.1. These plans are subject to change based on available resources, evolving development priorities, and shifts in the .NET landscape. Some features and improvements may be postponed to a later release date.

With that in mind, let’s dive into what we have in store for you!

Pain-free performance profiling

As .NET development continues its shift to the cloud, we’re seeing a fundamental change in how performance impacts our applications. Cloud platforms give incredible flexibility and scalability, but they also introduce new challenges. With pay-as-you-go pricing models, performance isn’t just about the user experience anymore – it directly affects your bottom line.

This is especially critical in microservice architectures, where a performance bottleneck in one service can cascade throughout your entire system. The same principle applies to game development, where performance issues can directly impact the player experience and where tracking frame times and memory allocations become crucial for optimization.

In user interviews for JetBrains Rider, we keep hearing from developers that, while they understand the importance of profiling, they often consider it a complex task that’s separate from their regular development workflow. For this reason, we’ve decided to rethink how profiling works in Rider. 

Our current suite of tools – Dynamic Program Analysis, the Monitoring tool, and our integrated dotTrace and dotMemory profilers – already provides robust performance insights. What we believe we should do now is make these tools more accessible and intuitive.

Our goal is simple: make performance monitoring and profiling feel like a natural extension of your development process, rather than a separate specialized task. We’ll be streamlining the interface, reducing configuration, and integrating profiling more deeply into your daily workflow. Whether you’re developing cloud services, creating games, or building traditional applications, Rider’s profiling tools will be there to help you optimize your code without getting in your way.

We’ll be sharing more specific details about these improvements in the coming months.

Debugging

Mixed mode debugging

For Rider 2025.1, we’re setting out to implement mixed mode debugging – a capability that lets you debug both .NET and C/C++ code in a single session. This is one of our most requested features, particularly from developers working with game engines like Unity and Unreal, as well as those building desktop applications that use native Windows APIs through P/Invoke.

Currently, debugging code that crosses the managed/native boundary requires juggling a couple of different tools. Our goal is to change that. When implemented, you’ll be able to:

  • Step seamlessly between .NET and native code.
  • Inspect variables across both environments.
  • Set breakpoints anywhere in your codebase.
  • Debug interop scenarios without switching contexts.
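For context, here is what a typical call that crosses the managed/native boundary looks like (an illustrative P/Invoke sketch, not taken from the roadmap post). Today, stepping from the C# call site into the native implementation behind it generally means attaching a second, native debugger; mixed mode debugging would let one session cover both sides:

```csharp
using System;

// Windows-only: prints milliseconds since the system started,
// via a call that crosses into native code.
Console.WriteLine(Native.GetTickCount64());

internal static class Native
{
    // Managed declaration for a native Windows API in kernel32.dll.
    // A mixed mode debugger could step from the C# caller straight
    // into the native implementation behind this call.
    [System.Runtime.InteropServices.DllImport("kernel32.dll")]
    internal static extern ulong GetTickCount64();
}
```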

We believe this addition will significantly improve the debugging experience for developers working with hybrid applications. While we can’t commit to a specific release timeline yet, we’re excited to begin this important work and will share our progress along the way.

C++ code debugging

We’ve been listening carefully to your feedback about C++ debugging in Rider, and in 2025, we’re embarking on a comprehensive improvement of the entire debugging experience.

Performance is at the heart of these changes. We’re optimizing PDB reading to get you into debugging sessions faster, and we’re also reworking collection evaluation to make it even more efficient. We know how important conditional breakpoints are for complex debugging scenarios, so we’re implementing a faster system for handling those. We’ll be improving data breakpoints in C++ too, making them more reliable and responsive.

But raw performance isn’t everything. We’re also focusing on making debugging feel more natural and controlled. New stepping filters will help you navigate your codebase with precision, letting you focus on just the right bits of code. We’re also improving how Rider handles debugger detachment, and we’re implementing a series of smaller but meaningful enhancements that, together, will make C++ debugging a much smoother experience.

Improved data visualizers 

For Rider 2025.1, we’re focusing on improving data visualization for LINQ expressions, making it easier for developers to understand and debug complex LINQ queries directly in the debugger. You’ll be able to inspect query execution, see intermediate results, and better understand how your LINQ operations transform the data. Check out this issue on our tracker to follow the progress and report your experience.

Remote development on Windows

We’re working on ensuring Rider 2025.1 will support Windows as a host machine for remote development. This step will complete our remote development journey, allowing Windows users to connect to remote development machines while keeping their local Rider instance responsive and familiar. Just like with our existing macOS and Linux remote development support, you’ll be able to run, debug, and test your applications remotely while maintaining the performance and experience of a local IDE.

Support for SQL Projects

In 2025, we’re modernizing Rider’s SQL Server support by integrating the SQL Tools API – the same technology that’s employed by Visual Studio Code and Azure Data Studio. This shift brings enterprise-grade database development capabilities directly into Rider.

The upgrade will allow for seamless management of SQL Server projects within your solutions, direct database publishing, and the use of robust schema comparison tools the IDE already offers. With the new cross-platform SDK support, you’ll be able to build and manage SQL projects across different operating systems, making Rider a more versatile tool for cloud and hybrid environments.

Enhanced Roslyn support

For the upcoming release of Rider, we’re working on adding support for Roslyn-based suppressors, which will let you fine-tune how compiler diagnostics work in your projects. If you’ve ever wanted more granular control over warnings and code analysis, this improvement is for you.

For developers working with code analysis tools, we’re also going to introduce a Roslyn Syntax Visualizer. This tool is designed to aid your understanding of exactly how Roslyn sees your code’s structure, making it significantly easier to develop custom analyzers and refactorings. Think of it as a diagnostic tool that lets you peek under the hood of your C# code. 

The road ahead

While some features, like mixed mode debugging, are still in the early stages, others are already taking shape. We’ll be sharing more details and progress updates in the coming months, so make sure you’re subscribed to our blog’s newsletter.

The Early Access Program is also right around the corner. Downloading and using the EAP builds is the quickest way to get a feel for our latest improvements. More on that in this blog post.   

We hope this post was insightful and that you’re just as excited for the upcoming changes as we are! As always, we welcome your feedback in the comments below or on our issue tracker.


C# 12.0: collection expressions

1 Share

C# 12.0 adds a new syntax for initializing the contents of collections. This was an interesting move, given that we already had several ways of doing this. Here are three slightly different but equivalent ways C# 11.0 lets us initialize an array of strings:

var verbose = new string[] { "Horses", "Mutton", "Miles" };
var shorter = new[] { "Horses", "Mutton", "Miles" };
string[] perfect = { "Horses", "Mutton", "Miles" };

(Naturally, avoiding var leads to superior results.)

And here's the new syntax:

string[] wasThatReallyNecessary = ["Horses", "Mutton", "Miles"];

It doesn't work with var today, by the way. If you're familiar with my writing on C# you'll be aware that I'm happy with that, but it's slightly surprising given that the C# team mostly seems to have a pro-var bias. (E.g., although Roslyn has an analyzer that can be used to enforce a preference for not using var, it appears to have been written by someone who never understood why someone might have such a preference, and as such it makes some very strange decisions.)

The new syntax does add new functionality, as you'll see later when I talk about spreads, but it also addresses some problems with the existing syntax.

What was wrong with the old syntax?

The old collection initializer syntax—the one with braces—worked not just with arrays but also collection classes such as List<T> and even Dictionary<K,V>. You can use it with any type that implements IEnumerable and which defines an Add method. If the Add method takes multiple arguments, as Dictionary<K,V>.Add does, we can use a nested syntax:

Dictionary<string, string> d = new()
{
    { "A", "Horses" }, { "B", "Mutton" }, { "C", "Miles" }
};

So it's flexible and widely supported. What's not to like?

Performance

One subtle problem is that there was no way for a type to provide direct support for initializing a new instance from a list of data. This IEnumerable + Add pattern is the only way a type can support this older syntax, and that has performance consequences. Look at this performance test:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<InitializerBenchmarks>();

[MemoryDiagnoser]
public class InitializerBenchmarks
{
    [Benchmark(Baseline = true)]
    public List<int> ListInitializer()
    {
        return new() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    }

    [Benchmark]
    public List<int> ListCollectionExpression()
    {
        return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    }
}

You might expect to see no difference here, because both do the same very simple thing: each creates a List<int> containing the numbers 1 through 10. What on earth could a collection expression do that could make a difference here?

Quite a lot, judging by the benchmark results:

| Method                   | Mean     | Error    | StdDev    | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|--------------------------|----------|----------|-----------|-------|---------|--------|-----------|-------------|
| ListInitializer          | 89.51 ns | 3.546 ns | 10.400 ns | 1.01  | 0.16    | 0.0516 | 216 B     | 1.00        |
| ListCollectionExpression | 24.81 ns | 0.964 ns |  2.782 ns | 0.28  | 0.04    | 0.0229 | 96 B      | 0.44        |

The collection expression worked over 3x faster, and used less than half as much memory!

The problem with the IEnumerable + Add pattern is that the List<T> has no way of knowing that it's about to be initialized with a fixed-size list, so it just picks its standard default initial size (4 entries in .NET 8.0), and then doubles its size each time it runs out of space. So the first benchmark allocates a 4-entry array, then an 8-entry array, and then a 16-entry array.
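That doubling behaviour is easy to observe directly via the Capacity property (a quick illustration of my own, not from the original post):

```csharp
using System;
using System.Collections.Generic;

var l = new List<int>();
Console.WriteLine(l.Capacity);   // 0: no backing array allocated yet

for (int i = 1; i <= 10; i++)
{
    l.Add(i);                    // backing array grows 4 -> 8 -> 16 as space runs out
}

Console.WriteLine(l.Capacity);   // 16: three arrays were allocated along the way
```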

We can tell the list to allocate a different initial capacity by adding a constructor argument:

[Benchmark]
public List<int> ListInitializerPreallocate()
{
    return new(10) { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
}

But this is non-obvious, and also fragile: if we happen to get the number wrong, or if something changes meaning that a once-correct number is no longer right (because we modified the list), then it won't be obvious that anything is wrong. (It's not an error to exceed the initial capacity—List<T> just allocates a larger array. This works, but it means our attempt to optimize performance fails.) You'd only discover the problem if you have tests that automatically detect performance regressions.

And interestingly, even this 'fixed' initializer version isn't as fast as the collection expression code:

| Method                     | Mean     | Error    | StdDev   | Median   | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|----------------------------|----------|----------|----------|----------|-------|---------|--------|-----------|-------------|
| ListInitializer            | 83.36 ns | 2.690 ns | 7.933 ns | 80.76 ns | 1.01  | 0.13    | 0.0516 | 216 B     | 1.00        |
| ListInitializerPreallocate | 32.06 ns | 0.827 ns | 2.427 ns | 31.94 ns | 0.39  | 0.05    | 0.0229 | 96 B      | 0.44        |
| ListCollectionExpression   | 20.62 ns | 0.498 ns | 1.388 ns | 20.39 ns | 0.25  | 0.03    | 0.0229 | 96 B      | 0.44        |

The memory usage is now the same, and this pre-allocating version is over twice as fast as the naive initialization expression, but the collection expression is significantly faster still.

How's it doing that?

Collection expressions define a way in which types can provide an optimized initialization mechanism. In .NET 8.0, List<T> does this, and you can see the benefit. I'll explain the details later, but here's the essential difference. With the old collection initializer syntax, the compiler effectively generates code that calls l.Add(1); l.Add(2); l.Add(3); ... but with the collection expression, it's able to supply all of the numbers as a Span<int> in a single call. The values can be copied into the List<int> with a single efficient loop instead of having to deal with Add after Add.

Immutable collections and other awkward types

The immutable collection types were always a bit of a pain to initialize from a list of literal values. The old collection initializer syntax requires a type that you can construct and then populate by calling Add repeatedly. ImmutableList<T> thwarts us twice here. First, it has no public constructor. Second, even if it did have one, calling Add repeatedly (as collection initializers do) doesn't actually work:

ImmutableList<int> list = ImmutableList<int>.Empty;
list.Add(1);
list.Add(2);

// Displays 0!
Console.WriteLine(list.Count);

The old initializer syntax presumes that Add changes the object on which you call it, but of course the whole point of immutable collections is that once created, any particular instance never changes. So these Add methods leave the collection unmodified and return a new collection with the new member. This is the way to use Add with an immutable list:

ImmutableList<int> list = ImmutableList<int>.Empty;
list = list.Add(1);
list = list.Add(2);

But this illustrates why it's really just as well that you can't use the old collection initializer syntax with immutable collections: if our goal is just to end up with the populated list, we're causing a lot of unnecessary work here because each call to Add creates a fully-functional immutable list. If we don't really need all those intermediate lists, there are more efficient ways. If we've already got the values we want in some existing IEnumerable<T>, we can call AddRange. Or, there's an ImmutableList<T>.Builder type that lets us call Add many times over without paying the full cost of creating a whole new ImmutableList<T> after every single call. With the builder type, we call ToImmutable after we've made all our Add calls, so that it only does the work of building a full new ImmutableList<T> once. There's also a helper that converts any IEnumerable<T> to an ImmutableList<T>.

If anything, there are too many options, because it's not at all clear which of these would be best. So let's try them all:

[MemoryDiagnoser]
public class ImmutableInitializerBenchmarks
{
    [Benchmark(Baseline = true)]
    public ImmutableList<int> Adds()
    {
        return ImmutableList<int>.Empty
            .Add(1).Add(2).Add(3).Add(4).Add(5).Add(6).Add(7).Add(8).Add(9).Add(10);
    }

    [Benchmark]
    public ImmutableList<int> AddRange()
    {
        return ImmutableList<int>.Empty.AddRange(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 });
    }

    [Benchmark]
    public ImmutableList<int> FromEnumerable()
    {
        return new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }.ToImmutableList();
    }

    [Benchmark]
    public ImmutableList<int> Builder()
    {
        ImmutableList<int>.Builder b = ImmutableList.CreateBuilder<int>();
        b.Add(1); b.Add(2); b.Add(3); b.Add(4); b.Add(5); b.Add(6); b.Add(7); b.Add(8); b.Add(9); b.Add(10);
        return b.ToImmutable();
    }

    [Benchmark]
    public ImmutableList<int> CollectionExpression()
    {
        return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    }
}

If you're looking for succinctness, there's no doubt that the new C# 12.0 collection expression syntax (the last benchmark in that code) is the winner. But what about performance?

| Method               | Mean     | Error    | StdDev   | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|----------------------|----------|----------|----------|-------|---------|--------|-----------|-------------|
| Adds                 | 740.5 ns | 18.24 ns | 53.77 ns | 1.01  | 0.10    | 0.4587 | 1920 B    | 1.00        |
| AddRange             | 218.8 ns | 4.95 ns  | 14.58 ns | 0.30  | 0.03    | 0.1414 | 592 B     | 0.31        |
| FromEnumerable       | 215.0 ns | 4.38 ns  | 11.31 ns | 0.29  | 0.03    | 0.1414 | 592 B     | 0.31        |
| Builder              | 383.1 ns | 7.73 ns  | 22.30 ns | 0.52  | 0.05    | 0.1316 | 552 B     | 0.29        |
| CollectionExpression | 172.9 ns | 4.16 ns  | 12.27 ns | 0.23  | 0.02    | 0.1204 | 504 B     | 0.26        |

No matter which approach we choose, we're in a world of pain compared to a normal List<T>. Immutable lists have some desirable advantages over more conventional collections, but those don't include speed or memory efficiency. But assuming the benefits are worth it, how do our options compare?

Clearly, repeatedly calling Add directly on the ImmutableList<T> is by far the worst option. All the alternatives here wait until we know exactly what's going into the final list before creating it, and you can see that this uses under a third of the memory. (It nearly quarters the memory usage in the best case.) These are also 2-4 times faster than the naive Add, Add, Add approach. So it's good that the old initializer syntax doesn't work at all here, because that is the only thing it knows how to do, which turns out to be terrible in this case.

But notice how the new collection expression syntax is the best option by a large margin here.

This is a good illustration of what this new language feature was trying to achieve. Not only were the various options for initializing an ImmutableList<T> pretty clunky, it wasn't at all obvious which was the best. But now, we have a much simpler option for immutable collections, and it's also the fastest option.

(If you are using low-level high performance mechanisms such as Span<T> and stackalloc it's not always as clear cut. The optimal choice of list initialization mechanism can depend on contextual knowledge not available to the compiler, so in those cases, collection expressions sometimes don't produce the fastest possible code. But in the majority of code, collection expressions perform best.)
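Collection expressions can also target Span<T> and ReadOnlySpan<T> directly, which is where that overlap with stackalloc shows up. A small illustration of my own (not from the post's benchmarks):

```csharp
using System;

// For small spans of unmanaged element types, the compiler is free to put
// the storage for this collection expression on the stack rather than
// allocating an int[] on the GC heap.
Span<int> digits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

int sum = 0;
foreach (int d in digits)
{
    sum += d;
}

Console.WriteLine(sum); // 55
```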

Opting in to collection initialization

How are the collection expression benchmarks able to run faster than the alternatives? This new language feature doesn't have magical capabilities—the C# compiler ultimately has to generate IL that the .NET runtime can execute, so we should be able to write code that gets the same performance without the new syntax.

Here's one way to do that with List<T>:

private static ReadOnlySpan<int> _data => new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

[Benchmark]
public List<int> ManuallyOptimized()
{
    List<int> l = new(10);
    l.AddRange(_data);
    return l;
}

That is significantly faster than multiple calls to Add simply because the List<T> can handle the change in a single step. (When you write a property with type ReadOnlySpan<int> using array syntax in this way, it doesn't really create an array. It compiles the binary representation of those numbers into a region of the DLL, and generates code that just gets a pointer to the memory containing that part of the DLL at runtime.)

Before I wrote this post, I was under the impression that this was roughly what the compiler would produce for a collection expression, but apparently not. That turns out to look more like this:

[Benchmark]
public List<int> ManuallyOptimized2()
{
    List<int> l = new(10);
    CollectionsMarshal.SetCount(l, 10);
    Span<int> ints = CollectionsMarshal.AsSpan(l);
    ints[0] = 1; ints[1] = 2; ints[2] = 3; ints[3] = 4; ints[4] = 5;
    ints[5] = 6; ints[6] = 7; ints[7] = 8; ints[8] = 9; ints[9] = 10;
    return l;
}

This is basically performing open brain surgery on the List<int>. It's saying "Act as though you've already been given 10 elements, and now give me direct access to your internal storage so I can put those elements in there directly".

These two manual benchmarks perform more or less identically, so why does C# generate the second, more complex-looking one? It results in a lot more IL, making the compiled program take up more space, so even if the runtime performance is identical, there's a small downside to this second approach. Why does it do it?

The big advantage of this technique is that it works equally well for dynamic contents. E.g.:

public List<int> MakeListWith(int item)
{
    return [1, 2, item, 3, 4];
}

The ReadOnlySpan<T>/AddRange technique I showed in my first manually optimized version works only if everything in the list is a constant. It also works only for certain data types—it works by embedding the raw data for the array directly into the binary of the component, and the ReadOnlySpan<int> is just a pointer to that block of memory. But it can't embed values as raw binary emitted directly into the compiled code if any of the values are going to be determined at runtime. So that's one reason the compiler uses the approach shown in ManuallyOptimized2.

Also, the technique of embedding the values as a block of binary doesn't work for string, because a string has to live on the GC heap. But this CollectionsMarshal technique will work equally well for all types. So that's another reason the compiler has for preferring this. (Of course, it could use the embedded binary approach in the cases where it is viable, but since there's no measurable performance benefit, the compiler might as well use a single, consistent approach that works for all cases.)

Those CollectionsMarshal methods are pretty specialized. Evidently the compiler knows some things about List<T> and is using that special knowledge to generate this code. We can't use that trick for our own types. Fortunately, you don't need to submit a change to the C# compiler repo just to get efficient initialization for your own types. There's a more general mechanism, and it's what the immutable collections use.

The following code is effectively identical to what the compiler produced for the immutable collections benchmark in which I used a collection expression:

private static ReadOnlySpan<int> _data => new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

[Benchmark]
public ImmutableList<int> ManuallyOptimized()
{
    return ImmutableList.Create(_data);
}

The compiler knew to use this Create method because the ImmutableList<T> class is annotated with a particular attribute:

[CollectionBuilder(typeof(ImmutableList), nameof(ImmutableList.Create))]
...
public sealed partial class ImmutableList<T> : ...

The CollectionBuilder attribute tells the C# compiler that this type provides support for efficient initialization, which is why the collection expression code is compiled as a call to this method.
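To illustrate how a custom type can opt into this mechanism, here's a minimal sketch. (MyList<T> and MyListBuilder are hypothetical names invented for this example, not real library types; a collection-expression target also needs to be enumerable, which is why this implements IEnumerable<T>.)

using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical type: opts into collection expressions via [CollectionBuilder].
[CollectionBuilder(typeof(MyListBuilder), nameof(MyListBuilder.Create))]
public sealed class MyList<T> : IEnumerable<T>
{
    private readonly T[] _items;
    internal MyList(T[] items) => _items = items;
    public int Count => _items.Length;
    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class MyListBuilder
{
    // The method the compiler calls when it sees: MyList<int> xs = [1, 2, 3];
    public static MyList<T> Create<T>(ReadOnlySpan<T> items) => new(items.ToArray());
}

With this in place, MyList<int> xs = [1, 2, 3]; compiles to a single call to MyListBuilder.Create, with the elements staged in a span.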

But wait a second. I just explained that there are good reasons the compiler doesn't do this for List<T>: this approach doesn't support lists with values determined at runtime. What does the compiler do if we ask for that?

private int counter = 0;

[Benchmark]
public ImmutableList<int> CollectionExpressionNonConstant()
{
    return [counter++, 2, 3, 4, 5, 6, 7, 8, 9, 10];
}

It produces something like this:

[InlineArray(10)]
private struct TenInts
{
    private int element;
}

[Benchmark]
public ImmutableList<int> ManualNonConstant()
{
    TenInts ints = default;
    ints[0] = counter++;
    ints[1] = 2; ints[2] = 3; ints[3] = 4; ints[4] = 5;
    ints[5] = 6; ints[6] = 7; ints[7] = 8; ints[8] = 9; ints[9] = 10;
    ReadOnlySpan<int> intsAsSpan = ints;
    return ImmutableList.Create(intsAsSpan);
}

It defines an inline array type (something I described in a recent blog). This particular one, TenInts, is a value type that holds exactly 10 int values, and because it is declared as a local, it doesn't need to live on the heap. (In the code the compiler generates for both CollectionExpressionNonConstant and ManualNonConstant, this 10-element inline array lives on the stack.) It then passes this to the method that ImmutableList<T> identified with its CollectionBuilder attribute.

This avoids any allocation beyond the minimum that ImmutableList has to do in any case.

Why doesn't the compiler just use this mechanism for List<T>? It's because it can exploit specific knowledge about what List<T> is going to have to do. List<T> will definitely allocate some contiguous block of memory to hold its elements. (It's possible to imagine an implementation that doesn't. But List<T> is part of the .NET runtime, so Microsoft can just decide to constrain its implementation in this way.) So there's no need to assemble all the elements in a separate chunk of memory and then copy them in: we can just ask List<T> to give us a span pointing to the memory it has allocated and we can build the list directly in place. But in general, collection types won't necessarily work that way, in which case this more general CollectionBuilder mechanism is the most efficient approach.

Spread elements

So far, all the examples I've shown have created collections whose size is determined at compile time. Collection expressions add a capability not available with the old initializer syntax: it's possible for the number of items in the initializer to vary at runtime. Here's an example:

public static int[] Bookend(int[] contents) =>
    [int.MinValue, .. contents, int.MaxValue];

That .. denotes a spread element. It indicates that this is not meant to be a single element in the list: instead we expect contents to be a collection, and we want to include all of its contents at this point in the new collection. For example calling Bookend([1, 2, 3]) would produce the same list as [int.MinValue, 1, 2, 3, int.MaxValue].

This feature surprised me once I started using it. I had understood the point of it beforehand, but in practice I've liked it more than I expected to. That might be because I've been doing a lot of work with source generators recently, an area where you do a lot of stitching together of lists. But I've found the resulting code to be pleasingly expressive.

Spreads cause heap allocations

Because a spread can prevent the compiler from knowing how many elements a collection expression will produce, it can't always generate a fixed-size inline array type to handle dynamic list construction. In the Bookend example, it knows it's going to have to produce an array, so it just generates code roughly like this:

public static int[] BookendManual(int[] contents)
{
    int[] result = new int[contents.Length + 2];
    result[0] = int.MinValue;
    for (int i = 0; i < contents.Length; ++i) { result[i + 1] = contents[i]; }
    result[^1] = int.MaxValue;
    return result;
}

That's fine: it was going to have to allocate the array anyway because our return type is int[], so there isn't a more efficient way to do this. But what about immutable lists?

static ImmutableList<int> BookendImmutable(int[] contents) =>
    [int.MinValue, .. contents, int.MaxValue];

This turns out to do more or less exactly what my BookendManual does: it also allocates an int[]. (The only difference is that it returns ImmutableList.Create(result) to turn that into an immutable list.) So we now end up with two versions of our list on the GC heap: the final ImmutableList<int> that we wanted, but also the staged list contents in an array.

The previous section showed the compiler generating code that staged the contents of a collection on the stack. But it only does that if it can determine the final collection size at compile time. (At least, that's true in today's compiler. Details of code generation are subject to change here, because the language feature was designed to allow some implementation latitude.)

In principle the compiler could have generated code like this:

static ImmutableList<int> BookendImmutableManualStackalloc(int[] contents)
{
    Span<int> x = stackalloc int[contents.Length + 2]; // DANGER!
    x[0] = int.MinValue;
    for (int i = 0; i < contents.Length; ++i) { x[i + 1] = contents[i]; }
    x[^1] = int.MaxValue;
    ReadOnlySpan<int> items = x;
    return ImmutableList.Create(items);
}

The reason it doesn't is that if contents is large, this is likely to cause a stack overflow.

The slightly frustrating thing about this is that if you can be certain that constraints in your application guarantee that contents will never be more than, say, 20 elements, you can't get the compiler to emit this more efficient code. When manually writing memory-efficient code, we often have a low-allocation code path that runs when things will fit easily on the stack, allocating only when they won't, e.g.:

static ImmutableList<int> BookendImmutableManualStackalloc(int[] contents)
{
    // Use stack unless contents array is large.
    Span<int> x = contents.Length < 256
        ? stackalloc int[contents.Length + 2]
        : new int[contents.Length + 2];
    x[0] = int.MinValue;
    for (int i = 0; i < contents.Length; ++i) { x[i + 1] = contents[i]; }
    x[^1] = int.MaxValue;
    ReadOnlySpan<int> items = x;
    return ImmutableList.Create(items);
}

It's slightly disappointing that there doesn't seem to be a way to use the collection syntax here. I'd really like to write something like this:

static ImmutableList<int> BookendImmutableManualStackalloc(int[] contents)
{
    // Use stack unless contents is large.
    Span<int> x = contents.Length < 256
        ? stackalloc int[contents.Length + 2]
        : new int[contents.Length + 2];
    x = [int.MinValue, .. contents, int.MaxValue];
    ReadOnlySpan<int> items = x;
    return ImmutableList.Create(items);
}

so that I don't have to implement that spread with my own loop. But this doesn't seem to be possible.

So it seems that once you get into the (somewhat rarefied) realm of code that makes decisions about whether dynamically-sized collections can safely use stackalloc, collection expressions are of no use to you. Perhaps that's for the best: if you're writing code that needs to make these sorts of decisions, it's arguably a good thing for the behaviour to be explicit.

To be fair, most people don't write that sort of code. And if you're not writing that sort of code, collection expressions will be as efficient as it's safe for them to be.

Why doesn't it work with var?

The C# team couldn't decide on the natural type for these expressions. What type do you think xs should be here?

var xs = [1, 2, 3];

(Incidentally, this illustrates in a nutshell exactly what I dislike about var. It forces you to stop and think about what the type of a variable actually is. In this case, that's a sufficiently hard question to answer that the C# team has been trying to work out what it should be for several years now. Code that makes you stop and think for several years is not good for productivity.)

Perhaps you think it should obviously be int[]? But that might be inefficient, so maybe it should obviously be ReadOnlySpan<int>. That raises the question: should code that does this expect to be able to modify the list? If so, this code had better create a new instance every time it executes. But if you didn't actually need that, it would be far more efficient for this to return the same instance every time.

It's really not obvious what the type should be here, which is why the compiler will reject this code. (It's possible the compiler team will eventually decide on a natural type for collection expressions, but they did not do so in C# 12.0.)

Type generation

You've seen that the compiler will sometimes generate an inline array type as part of its code generation for collection expressions. That's not the only kind of type that it might generate. If you write this:

IEnumerable<int> xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

the compiler generates an IEnumerable<int> implementation for you. That might be rather surprising. Why doesn't this code just create an array? The clue is in the name of the generated type, which will be something similar to <>z__ReadOnlyArray<T>. It's a read-only implementation: if you pass this to some method that casts to IList<int>, that cast will succeed, but attempts to modify the list will throw an exception.

This means that this IEnumerable<int> will always return the elements you provided in the collection expression: it can't be modified in a way that causes it to return different values. Maybe the compiler team did that to preserve the principle of least astonishment. It also makes it possible for this instance to be reused: since collections built this way are immutable, you can safely hand the same instance out multiple times, and even use it concurrently from multiple threads.
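You can observe this read-only behaviour directly. This is a small sketch (the name of the compiler-generated type will vary, and you never refer to it in source code):

IEnumerable<int> xs = [1, 2, 3];

// The generated type also implements IList<int>, so this cast succeeds...
IList<int> asList = (IList<int>)xs;
Console.WriteLine(asList[1]);   // reading works fine

try
{
    asList.Add(4);              // ...but mutation throws
}
catch (NotSupportedException)
{
    Console.WriteLine("This collection is read-only");
}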

One upshot of this is that using a collection expression to initialize an IEnumerable<int> is less efficient than using one to initialize an array. This code:

int[] xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

runs faster (about 39%) and allocates less memory (about 27%) than the preceding example.

Doesn't that contradict what I said earlier about collection expressions always choosing the most efficient mechanism? That depends on your point of view. If you want and expect an immutable collection in the first example, then that does actually require more work than simply allocating an array. So from that perspective, these two examples are asking for different things, and therefore have different costs. However, you could also argue that it's not obvious that I'm asking for an immutable collection in the first example.

This is really just a variation of the problem that leads to the fact that you can't use var: it's hard to decide what the natural type of a collection should be. Some people might expect immutability with IEnumerable<int>, so the compiler has been cautious and always delivers that. If the performance difference between that and an array matters to you, it's easy enough to tell the compiler what type you want.

This has some significance for a new feature in C# 13.0 by the way. Just two and a half decades after the params keyword was introduced (in C# 1.0), we finally got a feature people have been asking for ever since: the ability to use it with other types. For example, you can now write this in C# 13.0:

public static void LotsOfArgs(params IEnumerable<int> numbers) { ... }

When you invoke a method with a params argument, the compiler uses the same logic as it would with a collection expression. In effect, it turns this:

LotsOfArgs(1, 2, 3);

into this:

LotsOfArgs([1, 2, 3]);

So that means it will create one of these immutable IEnumerable<int> implementations here, making this less efficient than params int[]. I don't think this is a huge deal because IEnumerable<int> is typically going to result in less efficient code anyway: a foreach over an array can avoid allocating an enumerator but it can't avoid that when using IEnumerable<T>. Performance sensitive code can use params ReadOnlySpan<int> in C# 13.0, which enables more efficient code than either IEnumerable<T> or an array.
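For example, a Sum method written with params ReadOnlySpan<int> (a hypothetical illustration) avoids both the staging array and the enumerator allocation:

// C# 13.0: with params ReadOnlySpan<int>, the arguments can be staged
// on the stack rather than in a heap-allocated array.
public static int Sum(params ReadOnlySpan<int> numbers)
{
    int total = 0;
    foreach (int n in numbers)  // iterating a span needs no enumerator object
    {
        total += n;
    }
    return total;
}

A call such as Sum(1, 2, 3) then builds the span without any heap allocation.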

No support for dictionaries yet

The older C# collection initializer syntax supports types whose Add method takes multiple arguments, enabling you to initialize dictionaries thus:

new Dictionary<string, string> { {"A", "Horses" }, { "B", "Mutton" } };

Alternatively, you could use the object initializer syntax (which uses the indexer rather than Add, but would have the same outcome in this particular example):

new Dictionary<string, string> { ["A"] = "Horses", ["B"] = "Mutton" };

Surprisingly, the new collection expression syntax has almost no support for dictionary-like collections. I say almost, because there's exactly one way you can use it:

Dictionary<string, string> d = [];

I find it slightly annoying that the C# code analyzers recommend this as the way to initialize any dictionary to an empty state. The reason I don't like it is that for anything other than completely empty collections, this collection expression syntax currently only supports creation of lists, not dictionaries. So it seems positively disingenuous to use it in this way. When I see [] I think "that's no dictionary."

The C# team has discussed possibly adding key/value syntax in future language versions, at which point that use of [] for an empty dictionary will no longer offend me. So I can see there's some logic here. I'd just prefer not to be instructed to use it by default here today.

Summary

Collection expressions might seem to provide a relatively small improvement for the scenarios that were already covered by initialization expressions, but they have some significant advantages. In many scenarios they perform much better than the older syntax. They also enable initialization of some types for which the older syntax simply doesn't work. And they support an entirely new feature: spread elements enable us to include not just individual elements in the collection expression, but also expressions that represent lists to be injected into the final result. This syntax does not (as of C# 13.0) support initialization of dictionaries, but future versions might address this. And in all the scenarios where we can use either the old or new syntax, collection expressions always perform at least as well as the old approach, and will very often be more efficient.
