Search This Blog

Monday, November 10, 2014

What is the difference between a Clustered and Non Clustered Index?

What is the difference between a Clustered and Non Clustered Index?


A clustered index determines the order in which the rows of a table are stored on disk. If a table has a clustered index, then the rows of that table will be stored on disk in the same exact order as the clustered index.

An example will help clarify what we mean by that.


Suppose we have a table named Employee which has a column named EmployeeID. Let’s say we create a clustered index on the EmployeeID column. What happens when we create this clustered index? Well, all of the rows inside the Employee table will be physically – sorted (on the actual disk) – by the values inside the EmployeeID column. What does this accomplish? Well, it means that whenever a lookup/search for a sequence of EmployeeID’s is done using that clustered index, then the lookup will be much faster because of the fact that the sequence of employee ID’s are physically stored right next to each other on disk – that is the advantage with the clustered index. This is because the rows in the table are sorted in the exact same order as the clustered index, and the actual table data is stored in the leaf nodes of the clustered index.

Remember that an index is usually a tree data structure – and leaf nodes are the nodes that are at the very bottom of that tree. In other words, a clustered index basically contains the actual table level data in the index itself. This is very different from most other types of indexes as you can read about below.

When would using a clustered index make sense?


Let’s go through an example of when and why using a clustered index would actually make sense. Suppose we have a table named Owners and a table named Cars. This is what the simple schema would look like – with the column names in each table:


Owners
Owner_Name
Owner_Age
Cars
Car_Type
Owner_Name



Let’s assume that a given owner can have multiple cars – so a single Owner_Name can appear multiple times in the Cars table. Now, let’s say that we create a clustered index on the Owner_Name column in the Cars table. What does this accomplish for us? Well, because a clustered index is stored physically on the disk in the same order as the index, it would mean that a given Owner_Name would have all his/her car entries stored right next to each other on disk. In other words, if there is an owner named “Joe Smith” or “Raj Gupta”, then each owner would have all of his/her entries in the Cars table stored right next to each other on the disk.

When is using a clustered index an advantage?


What is the advantage of this? Well, suppose that there is a frequently run query which tries to find all of the cars belonging to a specific owner. With the clustered index, since all of the car entries belonging to a single owner would be right next to each other on disk, the query will run much faster than if the rows were being stored in some random order on the disk. And that is the key point to remember!

Why is it called a clustered index?


In our example, all of the car entries belonging to a single owner would be right next to each other on disk. This is the “clustering”, or grouping of similar values, which is referred to in the term “clustered” index.

Note that having an index on the Owner_Name would not necessarily be unique, because there are many people who share the same name. So, you might have to add another column to the clustered index to make sure that it’s unique.

What is a disadvantage to using a clustered index?


A disadvantage to using a clustered index is the fact that if a given row has a value updated in one of it’s (clustered) indexed columns what typically happens is that the database will have to move the entire row so that the table will continue to be sorted in the same order as the clustered index column. Consider our example above to clarify this. Suppose that someone named “Rafael Nadal” buys a car – let’s say it’s a Porsche – from “Roger Federer”. Remember that our clustered index is created on the Owner_Name column. This means that when we do a update to change the name on that row in the Cars table, the Owner_Name will be changed from “Roger Federer” to “Rafael Nadal”.

But, since a clustered index also tells the database in which order to physically store the rows on disk, when the Owner_Name is changed it will have to move an updated row so that it is still in the correct sorted order. So, now the row that used to belong to “Roger Federer” will have to be moved on disk so that it’s grouped (or clustered) with all the car entries that belong to “Rafael Nadal”. Clearly, this is a performance hit. This means that a simple UPDATE has turned into a DELETE and then an INSERT – just to maintain the order of the clustered index. For this exact reason, clustered indexes are usually created on primary keys or foreign keys, because of the fact that those values are less likely to change once they are already a part of a table.

A comparison of a non-clustered index with a clustered index with an example


As an example of a non-clustered index, let’s say that we have a non-clustered index on the EmployeeID column. A non-clustered index will store both the value of the EmployeeID AND a pointer to the row in the Employee table where that value is actually stored. But a clustered index, on the other hand, will actually store the row data for a particular EmployeeID – so if you are running a query that looks for an EmployeeID of 15, the data from other columns in the table like EmployeeName, EmployeeAddress, etc. will all actually be stored in the leaf node of the clustered index itself.



This means that with a non-clustered index extra work is required to follow that pointer to the row in the table to retrieve any other desired values, as opposed to a clustered index which can just access the row directly since it is being stored in the same order as the clustered index itself. So, reading from a clustered index is generally faster than reading from a non-clustered index.

A table can have multiple non-clustered indexes


A table can have multiple non-clustered indexes because they don’t affect the order in which the rows are stored on disk like clustered indexes.

Why can a table have only one clustered index?


Because a clustered index determines the order in which the rows will be stored on disk, having more than one clustered index on one table is impossible. Imagine if we have two clustered indexes on a single table – which index would determine the order in which the rows will be stored? Since the rows of a table can only be sorted to follow just one index, having more than one clustered index is not allowed.

Summary of the differences between clustered and non-clustered indexes


Here’s a summary of the differences:


  • A clustered index determines the order in which the rows of the table will be stored on disk – and it actually stores row level data in the leaf nodes of the index itself. A non-clustered index has no effect on which the order of the rows will be stored.
  • Using a clustered index is an advantage when groups of data that can be clustered are frequently accessed by some queries. This speeds up retrieval because the data lives close to each other on disk. Also, if data is accessed in the same order as the clustered index, the retrieval will be much faster because the physical data stored on disk is sorted in the same order as the index.
  • A clustered index can be a disadvantage because any time a change is made to a value of an indexed column, the subsequent possibility of re-sorting rows to maintain order is a definite performance hit.
  • A table can have multiple non-clustered indexes. But, a table can have only one clustered index.
  • Non clustered indexes store both a value and a pointer to the actual row that holds that value. Clustered indexes don’t need to store a pointer to the actual row because of the fact that the rows in the table are stored on disk in the same exact order as the clustered index – and the clustered index actually stores the row-level data in it’s leaf nodes.



Difference between Primary Key and Unique Clustered Index in SQL Server

Difference between Primary Key and Unique Clustered Index in SQL Server


The UNIQUE constraint uniquely identifies each record in a database table.

The UNIQUE and PRIMARY KEY constraints both provide a guarantee for uniqueness for a column or set of columns.

A PRIMARY KEY constraint automatically has a UNIQUE constraint defined on it.

Note that you can have many UNIQUE constraints per table, but only one PRIMARY KEY constraint per table.



You cannot create a unique index on a single column if that column contains NULL in more than one row. Similarly, you cannot create a unique index on multiple columns if the combination of columns contains NULL in more than one row. These are treated as duplicate values for indexing purposes.


A primary key must be unique, but that is just one of the its requirements. Another one would be that it cannot be null, which is not required of a unique constraint.

Also, while, in a way, unique constraints can be used as a poor man's primary keys, using them with IGNORE_DUP_KEY = ON is plainly wrong. That setting means that if you try to insert a duplicate, the insertion will fail silently.



They're definitely different. As mentioned in other answers:

  • Unique key is used just to test uniqueness and nothing else
  • Primary key acts as an identifier of the record.

Also, what's important is that the primary key is usually the clustered index. This means that the records are physically stored in the order defined by the primary key. This has a big consequences for performance.

Also, primary key is automatically included in all other indexes, so getting it doesn't require a record lookup, just reading the index is enough.

To sum up, always make sure you have a primary key on your tables. Indexes have a huge impact on performance and you want to make sure you get your indexes right.


Difference between Primary Key and Unique Clustered Index in SQL Server

Wednesday, July 23, 2014

How can use multiple aggregation in pivot query.

 Syntax for PIVOT.

PIVOT and UNPIVOT relational operators to change a table-valued expression into another table. PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.

SELECT <non-pivoted column>,
    [first pivoted column] AS <column name>,
    [second pivoted column] AS <column name>,
    ...
    [last pivoted column] AS <column name>
FROM
    (<SELECT query that produces the data>)
    AS <alias for the source query>
PIVOT
(
    <aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;

Below table we have Suppliercode and amount. We want change Row data in column means HP, CBS & GRN as column name also find total number of supplier code in table base of unique supplier.
Another output total amount base of unique supplier code.

ID
SupplierCode
Amount
1
HP
29
2
HP
30
3
CBS
40
4
CBS
50
5
HB
50

In below pivot query using multiple aggregation in pivot query for our output.
select * from cost
SELECT * FROM (
select CBS as CBS_Count,HB as HB_Count,HP as HP_Count from (
select count(SupplierCode)  as Code,SupplierCode  from cost
group by SupplierCode) as t
PIVOT (SUM(CODE) FOR SupplierCode IN([CBS],[HB],[HP])) AS PIB) T1,
(select * from
(
SELECT SupplierCode,SUM(Amount) as Total  from cost group by SupplierCode) as T2
pivot
(
SUM(Total) for SupplierCode in([CBS],[HB],[HP])) AS totalamount) T3





OUTPUT:

CBS_Count
HB_Count
HP_Count
CBS
HB
HP
2
1
2
90
50
59


Tuesday, July 22, 2014

How can use sqlCacheDependency for fetching data from data base without page refreshing.

Database name : TQueryNotification
Table name : EMP_Detail , cpyjob.
  
select * from cpyjob

no
job
1
SW
2
TT  
3
bank

select * from cpy.EMP_Detail

sno
name
address
email
phone
pin
1
chandra
delhi
cp@gmail.com
3455
2344
2
ravi
lucknow
rp@gmail.com
45667
2345
3
rohit
delhi
r@gmail.com
3455
344
4
roli
delhi
roli@gmailo.com
35335
353
5
raj1
kanpur
raJ@gmail.com
5556756
44545

Before uesing sqlCacheDependency to use them we first need to set up our database by executing the following scripts:

CREATE ROLE [cpy] AUTHORIZATION [dbo]
GO
 CREATE SCHEMA [cpy] AUTHORIZATION [cpy]
GO
GRANT CREATE PROCEDURE to [cpy];
GRANT CREATE QUEUE to [cpy];
GRANT CREATE SERVICE to [cpy];
GRANT SUBSCRIBE QUERY NOTIFICATIONS TO [cpy];
GRANT VIEW DEFINITION TO [cpy];

GRANT REFERENCES ON CONTRACT::[http://schemas.microsoft.com/SQL/Notifications/PostQueryNotification] TO [cpy];
GRANT RECEIVE ON QueryNotificationErrorsQueue TO [cpy];
GO

SELECT  [no],[job]  FROM [cpy].[cpyjob]   no result

SELECT  [Sno],[Name]  FROM [cpy].[EMP_Detail]  result

SELECT  [no],[job]  FROM [dbo].[cpyjob]    result

Have you ever thought of a scenario in an ASP.NET website that if something is inserted, updated or deleted from the database the ASPX page should auto-reflect it without explicitly refreshing the page? You can using an ajax timer and database polling can easily accomplish the mission, right?

Give it a second thought; is it really a good practice to poll the database periodically? No, its not, because every database call will increase the load on the website. What if we put the data into a cache and poll it periodically for a change in data . Sql server provide we have option QueryNotification and SQLCacheDependency for above task.


In the getdata() function I am executing a select command and populating the data table with the results and assigning it to a cache. If you will see I have created an object of sqlCacheDependency and passed a command object into its constructor. Later this dependency object has been passed to the Cache.Insert method. A timer is periodically checking whether the cache is empty or not. If the cache is empty it will be populated back from the database.Query Notifier that we enabled will notify whether the data has changed or not as soon as the data is changed our the cache will be Null and the freshly changed data will be shown in the grid.

There are the following limitations of SQL Cache Dependency:

Works on plain query not SP
No support of * in the query
Query should not contain UNION, INTERSECT, EXCEPT operators
Query should not contain Distinct Keyword
Limited use of AGGREGATE functions

C# code :


using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Data;
using System.Data.SqlClient;
using System.Web.Caching; 

public partial class _Default : System.Web.UI.Page
{
    string con = "Data Source=HP11-PC\\SQLEXPRESS;Initial Catalog=TQueryNotification;integrated security=true";
  protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            getdata();
        }
    }

    public void getdata()
    {
        if (HttpRuntime.Cache["Hello1"] == null)
        {
            SqlConnection c = new SqlConnection(con);
            SqlCommand cmd = new SqlCommand("SELECT  [Sno],[Name]  FROM [cpy].[EMP_Detail]", c);
            cmd.Notification = null;
            cmd.NotificationAutoEnlist = true;
            SqlCacheDependency cac1 = new SqlCacheDependency(cmd);
            SqlDependency.Start(con);
            c.Open();
            DataTable dt = new DataTable();
            SqlDataAdapter sda = new SqlDataAdapter(cmd);
            sda.Fill(dt);
            c.Close();
            HttpRuntime.Cache.Insert("Hello1", dt, cac1);
        }
        Load1();
    }
    public void Load1()
    {
        DataTable dt = (DataTable)HttpRuntime.Cache.Get("Hello1");
        GridView1.DataSource = dt;
        GridView1.DataBind();
    }
    protected void Timer1_Tick(object sender, EventArgs e)
    {
        getdata();
    }
}

aspx page code:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head id="Head1" runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
    <asp:ScriptManager ID="ScriptManager1" runat="server">
    </asp:ScriptManager>
    <asp:UpdatePanel runat="server" ID="upd" UpdateMode="Conditional">
        <ContentTemplate>
            <asp:Timer ID="Timer1" runat="server" Interval="600" OnTick="Timer1_Tick">
            </asp:Timer>
            <div>
                <asp:GridView ID="GridView1" runat="server">
                </asp:GridView>
            </div>
        </ContentTemplate>
    </asp:UpdatePanel>
    </form>
</body>
</html>






How can match particular word with in string.

How can match particular word with in string.

In below example we have room name string, we want categories our room as per category:
Deluxe,Single,Double etc. We have function roomCategory2()  and room name “Deluxe 1 Bedroom Suite Courtyard or City view” we can provide Deluxe category this room.


using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        roomCategory2();
    }
        
   //***********************************************************
    public string roomCategory()
    {
        string[] category = new string[] {"Deluxe","Single","Double"};

        string roomname = "Deluxe 1 Bedroom Suite Courtyard or City view";
        string retVal="";
        foreach (string finalname in category)
        {
             retVal = roomname.Split(" \n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries).First(p => p.Equals(finalname));
           
        }
        Response.Write(retVal);
       
        return "";
     }

//***********************************************************


    public string roomCategory1()
    {
        string[] category = new string[] { "Deluxe", "Single", "Double" };

        string roomname = "Deluxe 1 Bedroom Suite Courtyard or City view";
        string retVal = "";
      
        foreach (string x in category)
        {
            if (x.Contains(roomname))
            {
                Response.Write("true");
            }

            else
            {
                Response.Write("false");
            }
        }
       

        return "";
    }
//***********************************************************

    public string roomCategory2()
     {
        string[] category = new string[] { "Deluxe", "Single", "Double" };

        string roomname = "Single 1 Bedroom Suite Courtyard or City view";
        string[] splitroomname = roomname.Split(" \n\r".ToCharArray());
        string str="";
        foreach (var item in splitroomname)
        {
            foreach(var item1 in category)
              {
                  if (item == item1)
                      str = item1;
              }
           
        }
        Response.Write(str);
        return str;
       
     }
    //***********************************************************
  
}

Monday, July 21, 2014

Calculate number of days between two days in c#

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class Default3 : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        string checkoutDate = "04/13/2014";//mm/dd/yyyy
        DateTime checkout = Convert.ToDateTime(checkoutDate);
        DateTime currentDate = DateTime.Now;
        Response.Write("<br>");
        Response.Write(currentDate);
        Response.Write("<br>");
        Response.Write(checkout);
        Response.Write("<br>");
        int dayscount = DateDiffInDays(currentDate,checkout);
        Response.Write("<br>");
       
       
        string script = "<script type='text/javascript'>alert('Search not allowed more then 365 days');</script>";
        if (dayscount > 365)
        {
            Response.Write(script);
        }
        else
        {
            Response.Write(dayscount);
        }
    }

    private int DateDiffInDays(DateTime currentDate, DateTime checkout)
    {

        return checkout.Subtract(currentDate).Days;

    }
}