Content Developer II at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Disney To Stop Using Salesforce-Owned Slack After Hack Exposed Company Data

Disney plans to transition away from using Slack as its companywide collaboration tool after a hacking group leaked over a terabyte of data from the platform. Many teams at Disney have already begun moving to other enterprise-wide tools, with the full transition expected later this year. Reuters reports: Hacking group NullBulge had published data from thousands of Slack channels at the entertainment giant, including computer code and details about unreleased projects, the Journal reported in July. The data spans more than 44 million messages from Disney's Slack workplace communications tool, WSJ reported earlier this month. The company had said in August it was investigating an unauthorized release of over a terabyte of data from one of its communication systems.

Read more of this story at Slashdot.

Read the whole story
alvinashcraft
24 minutes ago
reply
West Grove, PA
Share this story
Delete

Pulumi Google Cloud Provider Version 8.0.0


The latest major release of the Pulumi Google Cloud Provider is available now! Our 8.0 release contains the latest upstream changes to keep you up-to-date along with the latest features and improvements from Pulumi.

The Pulumi Google Cloud provider can be used to provision any of the Google Cloud resources available in the upstream provider. The provider is open source and available on GitHub so you can always follow along with current issues and developments, or even open your first pull request.

Here are a few links to help you get started if you are new to Pulumi:

  • Getting Started - A guided walkthrough for creating your first project
  • Setup & Install - Instructions on installing the Google Cloud provider
  • How-to guides - Learn how to use the Google Cloud provider to provision specific resources
  • Templates - Use a quickstart template to create a new project
  • Pulumi AI - Ask Pulumi AI to create a new project

Looking Back

Since the last major release of this provider, we have continuously shipped improvements to our ecosystem, bringing the latest Pulumi features to your production stack. We have rolled out an improved diffing strategy and fixed state upgrades in the Pulumi Terraform Bridge, removing spurious diffs on preview and increasing confidence when deploying. Additionally we have improved accuracy and coverage for registry documentation, via better example conversion and general docs improvements for bridged providers.

New Modules

Over the last year, we have added support for several new modules. Among these are:

API Coverage Growth

The below chart shows the growth of this provider by resource, function, and supporting types since Version 7.

google-cloud-coverage

What’s New in 8.0

Added Deletion Protections

Several resources have new deletionProtection fields:

  • gcp.cloudrunv2.Service
  • gcp.cloudrunv2.Job
  • gcp.activedirectory.Domain
  • gcp.organizations.Folder
  • gcp.organizations.Project

Find the documentation for any of these resources in our registry.

New default provisioning label

A new default label, goog-pulumi-provisioned, lets you discover your Pulumi-provisioned resources in the GCP console, helping you track resources and how they were created. This label is available as an Output only and can be disabled in your provider configuration.

Upgrading

You can find our v7 -> v8 Migration Guide on the Pulumi Registry.


Daily Reading List – September 19, 2024 (#401)


It’s been a long week! I spent some time this afternoon setting up a new work laptop, and I’m going to foolishly bring it on a trip tomorrow. What could go wrong?

[article] Valkey 8.0 rides high at Open Source Summit in Vienna. It’ll be interesting to see where this fork starts to distinguish itself and diverge from Redis functionality.

[blog] 9 new features we announced at Made on YouTube 2024. I can’t say that any of these apply to me directly, but it’s a guarantee I’ll interact with the results by YouTube creators.

[blog] Keys to a resilient Open Source future. Is AI going to be the best option for open source security? It might be, given the scale of code we’re talking and the volunteer-heavy approach.

[blog] Introducing Netflix’s Key-Value Data Abstraction Layer. Abstractions are tricky to maintain, and can accidentally block you from using unique features underneath. But for scenarios like this, the use case makes sense to me.

[article] Deno 2 Arrives With Long-Term Support, npm Compatibility. Migrating to this Node.js replacement will be easier now, for those interested.

[blog] Quitting Time. Perseverance is important, but so is knowing when to quit. What are your criteria, and can you stick to them?

[blog] Apache Airflow ETL in Google Cloud. The spectrum of hosting options is typically raw compute, managed compute, and managed services. That applies here as well.





Try out OpenAI o1 in GitHub Copilot and Models


Starting today, we’re opening a preview to give developers an opportunity to test OpenAI o1-preview and o1-mini, hosted on Azure, in both GitHub Copilot and Models. Sign up to get access to use OpenAI o1 in GitHub Copilot Chat with Visual Studio Code and in the playground with GitHub Models.

OpenAI o1 is a new series of AI models equipped with advanced reasoning capabilities, trained to think through complex tasks using an internal thought process. During our exploration of using o1-preview with GitHub Copilot, we found that the model’s reasoning capability allows for a deeper understanding of code constraints and edge cases, which produced more efficient and higher-quality results. And o1-preview’s deliberate and purposeful responses made it easy to pinpoint problems and quickly implement solutions.

Now, you can test it out and start building on GitHub with o1-preview and o1-mini. During the preview, you can choose to use o1-preview or o1-mini to power Copilot Chat in VS Code in place of the current default model, GPT-4o. Toggle between models during a conversation, moving from quickly explaining APIs or generating boilerplate code to designing complex algorithms or analyzing logic bugs. Using o1-preview or o1-mini with Copilot gives you a first-hand look at the new models’ ability to tackle complex coding challenges.

You can also test either of the o1 models in the playground in GitHub Models to discover their unique capabilities and performance. And once you’re familiar with how the models work, take the next step and start to integrate the models into your own apps.

Test OpenAI o1 in a playground in the GitHub Marketplace.

With this preview, we’re excited to bring OpenAI’s latest advancements to you, whether you’re developing software along with Copilot or building the next great LLM-based product. We can’t wait to see what you build!

The post Try out OpenAI o1 in GitHub Copilot and Models appeared first on The GitHub Blog.


SQL, NoSQL and Vectors, Oh My!


Database systems have been fundamental to information technology, supporting everything from basic applications to intricate enterprise systems. They play a crucial role in organizing, storing and retrieving large volumes of data, enabling informed decision-making and strategic planning.

As technology has progressed, database technology has evolved to address the growing complexity and diversity of data management needs — starting with structured SQL databases, moving to NoSQL databases and now advancing to vector databases. Each stage marks a shift in the way data is stored, retrieved and managed. While each database type is tailored for specific applications, the common goal remains: to store, retrieve and manage data efficiently and effectively.

SQL Databases: The Foundation of Structured Data

SQL databases, also known as relational databases, were the first widely adopted database systems, emerging in the 1970s with the development of IBM’s System R and the theoretical foundation provided by Edgar F. Codd. These databases are built on a structured schema that defines tables, rows and columns to store data. The image below shows an example of a customer table in a relational database.

Figure 1: Customer table in a relational database.

This rigid structure ensures data integrity and enforces relationships between different data entities.

Let’s take a look at the strengths and limitations of SQL databases.

Strengths of SQL Databases

  • ACID compliance: SQL databases guarantee transactions’ atomicity, consistency, isolation and durability, making them ideal for applications where data integrity is paramount.
  • Complex querying: The structured nature of SQL databases allows for complex queries using SQL (Structured Query Language), which can join multiple tables and retrieve specific data.
  • Mature ecosystem: With decades of development, SQL databases like MySQL, PostgreSQL and Oracle offer robust support, tools and community resources.
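
The first two strengths can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The customers and accounts tables, the names in them and the transfer helper are hypothetical and not taken from the article. The sketch assumes a CHECK constraint stands in for a business rule, so a transfer that would overdraw an account fails and its earlier credit is rolled back with it.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE accounts (
        customer_id INTEGER REFERENCES customers(id),
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE customer_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE customer_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Join the two tables to list each customer's balance by name."""
    rows = conn.execute(
        "SELECT c.name, a.balance FROM customers c "
        "JOIN accounts a ON a.customer_id = c.id ORDER BY c.id"
    )
    return rows.fetchall()

ok = transfer(conn, 1, 2, 30)        # succeeds: both rows change together
failed = transfer(conn, 1, 2, 500)   # would overdraw, so the credit is undone too
print(ok, failed, balances(conn))
```

The second transfer credits one account before the debit violates the constraint. Because both statements share a transaction, the credit does not survive the failure, which is the atomicity property described above.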

Limitations of SQL Databases

  • Scalability challenges: SQL databases often face difficulties with horizontal scaling because they were initially designed to operate on a single server or a closely connected cluster. Although modern SQL databases now support horizontal scaling, implementing and managing it can still be more complex than with some NoSQL alternatives.
  • Rigid schema: The need to define a schema upfront is a limitation in scenarios where the data structure evolves over time or when dealing with unstructured data.

Despite these limitations, SQL databases remain the go-to choice for applications with well-defined data relationships, such as financial systems, ERP systems and inventory management.

The NoSQL Revolution: Embracing Flexibility and Scalability

In response to the changing needs of modern applications, particularly those requiring handling large volumes of unstructured and semi-structured data such as social media posts, sensor data and web content, NoSQL databases emerged in the early 2000s. Unlike SQL databases, NoSQL databases do not require a fixed schema, allowing them to store data more flexibly.

NoSQL databases come in various forms, including document databases like CouchDB, key-value stores like etcd, column-family stores like Cassandra and graph databases like Neo4j. Take a look at these types of NoSQL databases in the image below:

Figure 2: Types of NoSQL databases.

Strengths of NoSQL Databases

  • Horizontal scalability: NoSQL databases are designed to scale out by distributing data across multiple servers, making them ideal for handling large-scale, high-traffic applications.
  • Schema flexibility: The lack of a fixed schema allows for rapid iteration and the ability to store unstructured or semi-structured data, such as JSON, XML or even multimedia files.
  • High availability: Many NoSQL databases prioritize availability and partition tolerance, often sacrificing strict consistency in favor of greater uptime and fault tolerance.
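
As a rough illustration of horizontal scaling combined with schema flexibility, the sketch below spreads schemaless documents across several in-memory shards by hashing their keys. The ShardedDocumentStore class and the sample keys are hypothetical. Modulo hashing is used here only because it is short, whereas production systems typically rely on consistent hashing or range partitioning so that adding a node does not remap most keys.

```python
import hashlib

class ShardedDocumentStore:
    """Toy document store: schemaless dicts spread across N in-memory shards."""

    def __init__(self, num_shards=3):
        # Each shard stands in for a separate server.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Stable hash so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, document):
        self.shards[self._shard_for(key)][key] = document

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)

store = ShardedDocumentStore(num_shards=3)
# Documents need not share fields: no schema is declared anywhere.
store.put("user:1", {"name": "Ada", "email": "ada@example.com"})
store.put("post:9", {"title": "Hello", "tags": ["intro"], "likes": 3})
store.put("sensor:42", {"temp_c": 21.5})
print(store.get("post:9"))
```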

Limitations of NoSQL Databases

  • Eventual consistency: Some NoSQL databases use eventual consistency models, which can lead to temporary discrepancies in data.
  • Lack of standardization: The absence of a standard querying language like SQL makes it challenging to work across different NoSQL systems.
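
The temporary discrepancy behind eventual consistency can be shown with a toy primary/replica pair. The class and key names below are hypothetical, and replication is triggered by hand here, whereas a real system would do it asynchronously in the background.

```python
from collections import deque

class EventuallyConsistentStore:
    """Toy primary/replica pair where replication is applied lazily."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, key):
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order; afterwards the replica has converged.
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value

store = EventuallyConsistentStore()
store.write("cart:7", ["book"])
stale = store.read_from_replica("cart:7")   # replica has not seen the write yet
store.replicate()
fresh = store.read_from_replica("cart:7")   # replica now matches the primary
print(stale, fresh)
```

Between the write and the replication step, a reader hitting the replica sees the old state. Once replication catches up, all copies agree again.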

NoSQL databases have become the backbone of many modern web applications, big data platforms and real-time analytics systems, offering the flexibility and scalability that SQL databases often lack.

Vector Databases: Powering the Next Generation of AI

We have seen that the rise of unstructured and semi-structured data led to the rise of NoSQL databases. More recently, the need to gain insights from unstructured data has led to the emergence of a new type of database: the vector database. These databases are specifically designed to store and query vector embeddings, which are mathematical representations of unstructured data like text, images and audio.

What Are Vector Databases?

Vector databases are optimized for managing vector data, which differs from traditional databases’ structured rows and columns. Instead of storing text or numbers in a table, vector databases store dense, high-dimensional vectors generated by AI models. These vectors capture the essence of unstructured data, allowing for powerful similarity searches and data retrieval. A good example of a vector database is Milvus, which is the most popular vector database in terms of GitHub stars. Take a look at the image below that shows how a flower is represented in high-dimensional vectors.

Figure 3: An image represented in vector format.

A crucial feature of vector databases is the approximate nearest neighbor (ANN) search. ANN search enables the system to quickly find vectors most similar to a given query vector, which is essential for applications like image retrieval, recommendation systems and natural language processing.

For instance, an image search engine can retrieve images visually similar to a query image based on the distance between their vector representations in a high-dimensional space. The closer a stored image’s vector is to the query image’s vector, the more likely the two images are visually similar.
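
The idea of ranking stored vectors by closeness to a query can be sketched in a few lines of plain Python. Note that this is an exact, brute-force search rather than ANN: it compares the query with every stored vector, which is the work that ANN indexes such as HNSW or IVF are designed to avoid at scale. The three-dimensional vectors and file names are made up for illustration, while real embeddings usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical image embeddings (illustration only).
images = {
    "rose.jpg": [0.9, 0.1, 0.0],
    "tulip.jpg": [0.8, 0.3, 0.1],
    "truck.jpg": [0.0, 0.2, 0.9],
}
nearest = top_k([1.0, 0.2, 0.0], images, k=2)
print(nearest)
```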

Benefits of Vector Databases

Vector databases offer several key advantages that make them indispensable in AI-driven applications. Let us take a look at some of these benefits:

  1. Scalability: Vector databases such as Milvus are designed to handle vast amounts of vector data, making them ideal for large-scale AI applications. They can scale horizontally, distributing data across multiple nodes to ensure high availability and fault tolerance.
  2. Efficiency in high-dimensional search: Traditional databases struggle with the complexity of high-dimensional data. Vector databases, on the other hand, are built specifically to perform efficient similarity searches on such data, enabling quick and accurate retrieval of relevant vectors.
  3. Integration with AI pipelines: Vector databases seamlessly integrate with machine learning models and AI pipelines, facilitating the storage, retrieval and processing of vector data. This integration is crucial for developing end-to-end AI solutions that require real-time data processing and analysis.
  4. Enhancing AI with context: In retrieval-augmented generation (RAG) systems, vector databases store domain-specific knowledge externally, supplying the large language model relevant context during generation. This reduces hallucinations in large language models (LLMs) and improves the accuracy of their outputs, especially in applications requiring precise, context-aware responses.

Since RAG is a trending technology, let’s take an in-depth look at how vector databases power this technology.

Vector Databases and Retrieval-Augmented Generation (RAG)

One of the most innovative applications of vector databases is retrieval-augmented generation (RAG), a technique that enhances the capabilities of LLMs by augmenting them with external knowledge. RAG systems combine LLMs’ generative power with vector databases’ retrieval capabilities to produce more accurate and contextually relevant responses.

In a RAG system, the vector database retrieves relevant information that can guide the large language model’s output. For example, when a user queries the system, a vector database retrieves documents or embeddings related to the query. These retrieved vectors provide context or specific information the language model uses to generate a more informed and precise response. This integration is valuable in applications such as customer support, where the ability to provide accurate and context-sensitive responses is critical.
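
The retrieve-then-generate flow can be sketched as follows. This is a simplified illustration under stated assumptions: embed is a stand-in that counts words rather than calling a real embedding model, the document list plays the role of the vector database, and the final call to a language model is omitted, so the sketch stops at assembling the prompt. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: similarity(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Prepend the retrieved context to the question before it goes to the LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refunds are issued within 14 days of a return request.",
    "Our support line is open on weekdays from 9 to 5.",
    "Shipping is free for orders over 50 dollars.",
]
prompt = build_prompt("How long do refunds take?", docs)
print(prompt)
```

In a real deployment, the documents would be embedded once and stored in the vector database, and only the question would be embedded at query time.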

Take a look at the following guide to understand how RAG is used in conjunction with vector databases to build AI apps.

Differences Between SQL, NoSQL and Vector Databases

For a more concise comparison between SQL, NoSQL and vector databases, take a look at the table below:

Feature | SQL Databases | NoSQL Databases | Vector Databases
--- | --- | --- | ---
Data Model | Relational (tables with rows and columns) | Non-relational (document, key-value, graph, etc.) | Vector-based (high-dimensional embeddings)
Schema | Rigid, predefined schema | Flexible, dynamic schema | Schema-less; focuses on vector embeddings
Query Language | Structured Query Language (SQL) | Varies (NoSQL query languages, APIs) | Vector search methods (ANN, cosine similarity)
Data Type Focus | Structured data | Semi-structured and unstructured data | Unstructured data represented as vectors
Scalability | Vertical scaling (limited horizontal scaling) | Horizontal scaling | Highly scalable with horizontal distribution
Use Case Examples | Transactional systems, analytics | Big data, real-time web apps, distributed systems | AI/ML applications, similarity searches
Performance | Optimized for complex queries, joins | Optimized for speed and scalability | Optimized for high-dimensional vector similarity search
Typical Applications | Banking, ERP, CRM systems | Social networks, IoT, content management | Image retrieval, recommendation engines, NLP, RAG
Storage Format | Rows and columns | Varies (JSON, BSON, etc.) | High-dimensional vectors

We have now examined the evolution of database technology to date. Let us now see what the future of databases might be like.

The Future of Database Technologies

The future of databases lies in the convergence of AI, big data and advanced search capabilities. Vector databases are set to lead this evolution, providing the backbone for AI-driven applications that require high-dimensional data search.

As technologies like RAG mature, databases will integrate more deeply with AI pipelines, enhancing real-time data processing and context-aware responses across industries. This shift will democratize AI, making advanced capabilities more accessible and driving innovation across sectors.

If you would like to get started learning about how vector databases work and how they power our everyday lives, take a look at this Vector Database 101 series guide.

Conclusion

The evolution of database technology from SQL to NoSQL to vector databases reflects the changing needs of data management in an increasingly complex and data-rich world. SQL databases laid the foundation with their structured approach, ensuring data integrity and enabling complex queries.

NoSQL databases brought flexibility and scalability to handle large volumes of unstructured data, driving modern web applications and real-time analytics. Now, vector databases are emerging as a critical tool in AI-driven applications, powering advanced similarity search capabilities and enhancing AI models with contextual understanding.

As technology advances, vector databases such as Milvus and Zilliz Cloud (a fully managed Milvus service) will play a pivotal role in the future of AI and data management, offering new ways to store, retrieve and analyze data. The continued integration of AI with databases promises to unlock even greater possibilities, making data-driven insights more accessible and impactful across industries.

The post SQL, NoSQL and Vectors, Oh My! appeared first on The New Stack.


Data Science Pack for VS Code Bundles Python, Data and Copilot Tools

New extension pack bundles wildly popular tools for Python development, assisted by the AI-powered GitHub Copilot and a data wrangler.