Friday, March 8, 2019

Import and export data using Azure blob storage and the BCP utility


In this example, we export a data file from a local SQL Server using the BCP utility, upload it to Azure blob storage, and import it into an Azure SQL server.
We will do it in the following steps:
  1. Export data from local SQL server 
  2. Create an Azure blob storage account
  3. Create an Azure blob storage container
  4. Upload exported files to blob storage
  5. Import data from the uploaded Azure blob storage data file

1. Export data from the local SQL server: 

Using the BCP utility, we can export a .dat data file and a format file.

bcp "SELECT Name, Color, Price, Size, Quantity, Data, Tags FROM Product" queryout product.dat -S (localdb)\MSSQLLocalDB -d MemEligibiltyTest -T
Here "product.dat"  is an output filename. 
-S  Sql server hostname
-d  Database Name
-T Trusted connection.
We need to provide a format for each field. bcp asks for each field like below.
Enter prefix-length of field Name [2]: 8
Enter field terminator [none]: none
Enter the file storage type of field Color [nvarchar]: cr
Enter prefix-length of field Color [2]: 8
Enter field terminator [none]: none
Enter the file storage type of field Price [money]: cr
Enter prefix-length of field Price [1]: 8
Enter field terminator [none]: none
Enter the file storage type of field Size [nvarchar]: cr
Enter prefix-length of field Size [2]: 8
Enter field terminator [none]: none
Enter the file storage type of field Quantity [int-null]: cr
Enter prefix-length of field Quantity [1]: 8
Enter field terminator [none]: none
Enter the file storage type of field Data [nvarchar]: cr
Enter prefix-length of field Data [2]: 8
Enter field terminator [none]: none
Enter the file storage type of field Tags [nvarchar]: cr
Enter prefix-length of field Tags [2]: 8
Enter field terminator [none]:
Do you want to save this format information in a file? [Y/n] y
Host filename [bcp.fmt]: product.fmt
Starting copy...
4 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total: 15 Average : (266.67 rows per sec.)

Because I provided a prefix-length of 8 for each field, the generated file keeps the first 8 characters of each field readable; the rest of the characters are in binary format.

This creates two files, product.dat and product.fmt: the first is the data file and the second is the format file.

2. Create an Azure blob storage account

We will use PowerShell to connect to Azure.
Azure PowerShell works with PowerShell 5.1 or higher on Windows or PowerShell 6 on any platform. To check your PowerShell version, run the command:
$PSVersionTable.PSVersion
You can install the Azure PowerShell (Az) module using the following command:
Install-Module -Name Az -AllowClobber

Connect to Azure with a browser sign-in token:
Connect-AzAccount
o/p:
Account SubscriptionName TenantId Environment
------- ---------------- -------- -----------
Atulxxxxxxx@xxxxxx.com ABCD - Development dxxxxxx3-xxxx-xxxx-xxxx-xxxxxxxxxxxx AzureCloud
You are now logged in to Azure, and it shows the default subscription. If you want to execute everything in another subscription, you need to select that subscription.
You can get an Azure subscription by subscription ID using the following command.
PS C:\Windows\system32> $context = Get-AzSubscription -subscriptionId xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
You can store the output in a variable and use that same variable with Set-AzContext to select a particular Azure subscription.
command:
PS C:\Windows\system32> set-AzContext $context
o/p:
Name Account SubscriptionName Environment TenantId
---- ------- ---------------- ----------- --------
ABCD  (xxxxxxxx-xxxx-xxx... Atul.Patel@xxxx... ABCD AzureCloud xxxxxxxx-xxxx-x...
Define variables for the resource group and storage account name.
$resourceGroupName="your-resource-group-name"
$blobStorageAccountName="yourblobstorageaccountname"
You can create a new Azure storage account using the following command and store the account context in a variable.
PS C:\Windows\system32> $storageAccount=New-AzStorageAccount -ResourceGroup $resourceGroupName -Name $blobStorageAccountName -skuname Standard_LRS -Location WestUS2
o/p:
StorageAccountName ResourceGroupName Location SkuName Kind AccessTier CreationTime ProvisioningState
------------------ ----------------- -------- ------- ---- ---------- ------------ -----------
yourblobstorageaccountname your-resource-group-name westus2 StandardLRS Storage 3/5/2019 8:55:10 PM Succeeded
If you already have an Azure storage account, you can get it using the following command.
PS C:\Windows\system32> $storageAccount=Get-AzStorageAccount -ResourceGroup $resourceGroupName -Name $blobStorageAccountName
PS C:\Windows\system32> $storageAccount
o/p:
StorageAccountName ResourceGroupName Location SkuName Kind AccessTier CreationTime ProvisioningState
------------------ ----------------- -------- ------- ---- ---------- ------------ -----------
yourblobstorageaccountname your-resource-group-name westus2 StandardLRS Storage 3/5/2019 8:55:10 PM Succeeded

3. Create an Azure blob storage container

Store the storage account context in a variable:
$ctx=$storageAccount.Context

# Variable for the container name
$containerName="yourcontainername"

Create a new container using the following command:
New-AzStorageContainer -Name $containerName -Context $ctx

Name PublicAccess LastModified
---- ------------ ------------
blogstoragename Off 3/5/2019 8:59:56 PM +00:00

4. Upload exported files to blob storage

You can use the same Azure blob storage container context to upload files.
command:
Set-AzStorageBlobContent -File "C:\Users\atul.patel\product.fmt" `
-Container $containerName `
-Blob "product.fmt" `
-Context $ctx

o/p:
Name BlobType Length ContentType LastModified AccessTier SnapshotTime IsDeleted
---- -------- ------ ----------- ------------ ---------- ------------ ---------
product.fmt BlockBlob 755 application/octet-stream 2019-03-05 21:37:54Z Unknown False

Upload the other file:
Set-AzStorageBlobContent -File "C:\Users\atul.patel\product.dat" `
-Container $containerName `
-Blob "product.dat" `
-Context $ctx
o/p:
Name BlobType Length ContentType LastModified AccessTier SnapshotTime IsDeleted
---- -------- ------ ----------- ------------ ---------- ------------ ---------
product.dat BlockBlob 456 application/octet-stream 2019-03-05 21:38:08Z Unknown False

List the uploaded files in the Azure blob storage container:
PS C:\Windows\system32> Get-AzStorageBlob -Container $containerName -Context $ctx | select Name
Name
----
product.dat
product.fmt

5. Import data from the uploaded Azure blob storage data file


Import data into the SQL Server database from Azure blob storage:

-- Import into DB
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'XXXXXXXXxxxxxxxxxxxxXXX'

--  Create credential with Azure Blob SAS

CREATE DATABASE SCOPED CREDENTIAL youcredentialname
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2018-03-28&ss=bfqt&srt=sco&sp=xxxxxxxxxx&se=2019-03-06T23:30:14Z&st=2019-03-06T15:30:14Z&spr=https&sig=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
-- NOTE: DO NOT PUT THE LEADING '?' CHARACTER IN THE SECRET!

-- Create external data source
CREATE EXTERNAL DATA SOURCE blogstoragename
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://yourblobstorageaccountname.blob.core.windows.net/blogstoragename',
CREDENTIAL = youcredentialname -- CREDENTIAL is not required if the blob storage is public!
);

--Create table

CREATE TABLE dbo.Product(
Name nvarchar(50) NOT NULL,
Color nvarchar(15) NULL,
Price money NOT NULL,
Size nvarchar(5) NULL,
Quantity int NULL,
Data nvarchar(4000) NULL,
Tags nvarchar(4000) NULL
)

BULK INSERT Product
FROM 'product.dat'
WITH ( DATA_SOURCE = 'blogstoragename',
FORMATFILE='product.fmt',
FORMATFILE_DATA_SOURCE = 'blogstoragename',
TABLOCK);
-- Or query the uploaded file directly with OPENROWSET
SELECT Name, Color, Price, Size, Quantity, Data, Tags
FROM OPENROWSET(BULK 'product.dat',
DATA_SOURCE = 'blogstoragename',
FORMATFILE='product.fmt',
FORMATFILE_DATA_SOURCE = 'blogstoragename') as products;



Monday, January 21, 2019

SQL server JSON using practical scenario with .Net Core, C# and Dapper

SQL Server 2016 and later come with good support for storing JSON. You can find many syntactical tutorials online. My point in adding one more is to show where we should use it and where exactly it fits best.
We may have already tried NoSQL databases like MongoDB (and the long list from each cloud vendor).
If you are exposing your data via an API as JSON, then why not store it as JSON?

Let's jump into an example.




In my example, I have created controllers for Customer and CustomerContact. I have a service folder, and each action has its own service class. I have used Dapper, a lightweight ORM, to interact with the database. This architecture is inspired by Jimmy Bogard's slice-based architecture; a URL explaining slice-based architecture is given below in the reference section.
In each service, the class has its own Query/Command model class and its own handler. 


Here are my basic models, Customer and CustomerContact.

Note: the code URL is given below in the reference section.
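
The model classes were shown as screenshots in the original post. Below is a minimal sketch of what Customer and CustomerContact might look like, with property names assumed from the response model and the OPENJSON column list used later in this post; the actual classes in the linked repository may differ.

using System;
using System.Collections.Generic;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Active { get; set; }

    // Stored as a JSON array in a single nvarchar column on the Customer table.
    public List<CustomerContact> CustomerContacts { get; set; }
}

public class CustomerContact
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public string FirstName { get; set; }
    public string MiddleName { get; set; }
    public string LastName { get; set; }
    public string CompanyName { get; set; }
    public string SalesPerson { get; set; }
    public string EmailAddress { get; set; }
    public string Phone { get; set; }
}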

Here we assume a Customer is basically a company, and each customer may have multiple CustomerContacts. You could keep Customer and CustomerContacts in separate database tables with a referential constraint between them.

For the current example, we keep both Customer and CustomerContact in the same table as an object graph, by storing a CustomerContacts column as JSON in the Customer table.




Infrastructure background used in the example.

1. Create Customer


Following is my POST action in the controller.


        [HttpPost]
        public async Task<int> Post(Customer customer)
        {
            return  await _createCustomer.Handler(customer);
        }

In the above controller, I have injected ICreateCustomer and assigned it to variable _createCustomer.


CreateCustomer has IDbConnection injected, which is a SqlConnection instance registered in the ASP.NET Core Startup class as a scoped service:
services.AddScoped(typeof(IDbConnection), x => ( new SqlConnection(Configuration.GetConnectionString("SQLJsonDBConnectionString"))));


Command:
We have a Handler method in the CreateCustomer service. The handler handles a command passed as an argument of the method. You can create a separate CreateCustomerCommand, map it to the model attached to the database, and expose to the controller only the properties available in the command.

Dapper:

Dapper is a lightweight ORM that extends the IDbConnection interface. In the CreateCustomer example above, we have used the QuerySingleAsync method, which can execute a SQL statement and return the result as a single value. You can find more methods in the Dapper documentation.

In the above example, we have used a simple INSERT query with an OUTPUT clause returning the newly created Id.

Serialize an object to JSON:
As we want to store CustomerContacts as JSON, we need to serialize them using JsonConvert.SerializeObject(object).
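
The CreateCustomer handler itself was shown as a screenshot. A minimal sketch of what it might look like, assuming a dbo.Customer table with Name, Active and CustomerContacts columns and an ICreateCustomer interface exposing Handler (the exact SQL and interface shape in the linked repository may differ):

using System.Data;
using System.Threading.Tasks;
using Dapper;
using Newtonsoft.Json;

public class CreateCustomer : ICreateCustomer
{
    private readonly IDbConnection _connection;

    public CreateCustomer(IDbConnection connection)
    {
        _connection = connection;
    }

    public async Task<int> Handler(Customer customer)
    {
        // OUTPUT returns the newly created Id as a single value.
        var sql = @"INSERT INTO dbo.Customer (Name, Active, CustomerContacts)
                    OUTPUT INSERTED.Id
                    VALUES (@Name, @Active, @CustomerContacts);";

        return await _connection.QuerySingleAsync<int>(sql, new
        {
            customer.Name,
            customer.Active,
            // CustomerContacts is serialized to a JSON string before it is saved.
            CustomerContacts = JsonConvert.SerializeObject(customer.CustomerContacts)
        });
    }
}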


Postman

Our CreateCustomer command is posted from Postman as a JSON body.

In SQL Server, the CustomerContacts column of the new row is stored as a JSON string.


2. Get Customer


We have created a customer; now let's get a customer.

Query Object:
As we try to follow the SOLID principles, we will create another service, GetCustomers, which has its own Query and response model.

     public class CustomerResponseModel
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public bool Active { get; set; }
        public string CustomerContacts { get; set; }
    }
    public class Query
    {
        public int Id { get; set; }
    }

SQL:
We will have a simple SELECT query as the SQL statement, with Id as a parameter passed from the Query object.

Dapper:
QueryAsync with CustomerResponseModel as the generic type parameter gives a list of CustomerResponseModel.
The selected fields and the response model property names should match.

Response:
To get the CustomerContacts JSON back into the CustomerContact model, we can use JsonConvert.DeserializeObject<T>.
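
The GetCustomers handler was also shown as a screenshot; a minimal sketch of how it might look using the Query and CustomerResponseModel classes above (the IGetCustomers interface name and the exact SQL are assumptions):

// (same using directives as the CreateCustomer sketch above, plus System.Collections.Generic)
public class GetCustomers : IGetCustomers
{
    private readonly IDbConnection _connection;

    public GetCustomers(IDbConnection connection)
    {
        _connection = connection;
    }

    public async Task<IEnumerable<CustomerResponseModel>> Handler(Query query)
    {
        // Selected column names must match the CustomerResponseModel property names.
        var sql = @"SELECT Id, Name, Active, CustomerContacts
                    FROM dbo.Customer
                    WHERE Id = @Id;";

        var customers = await _connection.QueryAsync<CustomerResponseModel>(sql, new { query.Id });

        // CustomerContacts comes back as a raw JSON string; deserialize it where the typed model is needed, e.g.:
        // var contacts = JsonConvert.DeserializeObject<List<CustomerContact>>(model.CustomerContacts);

        return customers;
    }
}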


Postman

The client won't notice how we converted or stored the data in the database; the response looks the same.


3. Create customer with dynamic CustomerContacts

In the above example, we have seen that we can insert data as JSON when the CustomerContact object has fixed properties. Another advantage of storing JSON in the database is that we can add any number of properties without any fixed schema; the stored JSON is effectively schema-less.


CreateCustomerCommand:
CreateCustomerCommand looks just like the Customer object, except the CustomerContacts property is dynamic.

SQL: The SQL is the same as for the fixed schema.

Dapper: QuerySingleAsync with int as the return type. We only have to serialize the dynamic object to JSON before saving it.
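
A minimal sketch of what the dynamic command might look like; the property names besides CustomerContacts are assumptions based on the fixed-schema Customer model shown earlier:

public class CreateCustomerCommand
{
    public string Name { get; set; }
    public bool Active { get; set; }

    // dynamic, so the caller can post contacts of any shape.
    public dynamic CustomerContacts { get; set; }
}

// The handler is the same as for the fixed schema; only the serialization input changes:
// CustomerContacts = JsonConvert.SerializeObject((object)command.CustomerContacts)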


Postman: While posting CreateCustomer, you can add any new properties you require, regardless of schema.

Disadvantage: In this approach the schema is normally controlled by the UI, which is not a good thing. If it is exposed without safeguards, it allows any junk data in, so you have to be careful while implementing this approach.
Even if you have a requirement for dynamic properties, I still recommend having some schema at the model level. You still don't have to worry about database schema changes for newly added fields.


4. Get customer as dynamic


In a similar way, you can get a Customer with dynamic CustomerContacts.

With a similar SQL statement, you just need to deserialize the JSON into a dynamic object using JsonConvert.DeserializeObject<dynamic>.

This is a little safer in case you have a dynamic object with additional properties.
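
A minimal sketch of the only change on the read side, where model is a CustomerResponseModel returned by the same SELECT as before:

// Same SELECT as in GetCustomers; only the deserialization target changes.
dynamic contacts = JsonConvert.DeserializeObject<dynamic>(model.CustomerContacts);

// Any property present in the stored JSON can now be read, e.g.:
// string firstName = contacts[0].FirstName;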


5.  Add Customer contact


We have a Customer, and we saved CustomerContacts as a JSON array in SQL. If we have to add one more element to the same JSON array, we can use the JSON_MODIFY() function.

CreateCustomerContactCommand:
This is a command class with CustomerId and CustomerContact properties.



SQL:
 var sql = @"UPDATE [dbo].[Customer]
                            SET [CustomerContacts] = JSON_MODIFY(CustomerContacts,'append$',JSON_QUERY(@CustomerContact,'$'))
WHERE Id=@CustomerId";

The JSON_MODIFY function adds one more element to the existing JSON array and updates the same row.
JSON_QUERY is useful for inserting a pure JSON object instead of a JSON string with \" escape sequences injected.
More technical details of these functions are explained here:

https://docs.microsoft.com/en-us/sql/t-sql/functions/json-modify-transact-sql?view=sql-server-2017
https://docs.microsoft.com/en-us/sql/t-sql/functions/json-query-transact-sql?view=sql-server-2017
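
A minimal sketch of how the handler for this command might execute the update with Dapper; the class and method names here are assumptions, with _connection injected as before:

public class CreateCustomerContactCommand
{
    public int CustomerId { get; set; }
    public CustomerContact CustomerContact { get; set; }
}

public async Task<int> Handler(CreateCustomerContactCommand command)
{
    var sql = @"UPDATE [dbo].[Customer]
                SET [CustomerContacts] = JSON_MODIFY(CustomerContacts, 'append $', JSON_QUERY(@CustomerContact, '$'))
                WHERE Id = @CustomerId";

    // ExecuteAsync returns the number of affected rows.
    return await _connection.ExecuteAsync(sql, new
    {
        command.CustomerId,
        // Serialize the new contact so JSON_QUERY can treat it as a JSON fragment.
        CustomerContact = JsonConvert.SerializeObject(command.CustomerContact)
    });
}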

If the JSON object is inserted without JSON_QUERY, it is stored as an escaped JSON string instead of a nested JSON object.


Postman:



6.  Get customer contacts


If you want to get only the customer contacts, you need to use the OPENJSON function of SQL Server.

GetCustomerContactQuery:
The query has all the parameters you want to pass; in our case, CustomerId is the important one.



SQL:


var sql = @"SELECT js.*
                        FROM Customer c
                        CROSS APPLY OPENJSON(c.[CustomerContacts], '$') WITH (
                        Id uniqueidentifier
                        ,Title VARCHAR(10)
                        ,FirstName VARCHAR(50)
                        ,MiddleName VARCHAR(50)
                        ,LastName VARCHAR(50)
                        ,CompanyName VARCHAR(50)
                        ,SalesPerson VARCHAR(100)
                        ,EmailAddress VARCHAR(100)
                        ,Phone VARCHAR(20)
                        ) AS js
                        WHERE c.id = @CustomerId";

You have to CROSS APPLY OPENJSON on the JSON column, listing all the columns with data type and length in the WITH clause. The column names should match the JSON property names, including case.
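
The handler then simply runs this query with Dapper; a minimal sketch, assuming GetCustomerContactQuery exposes a CustomerId property and _connection is the injected IDbConnection:

// Column names in the WITH clause map directly onto CustomerContact properties.
var contacts = await _connection.QueryAsync<CustomerContact>(sql, new { query.CustomerId });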

OPENJSON is explained technically in detail here:
https://docs.microsoft.com/en-us/sql/t-sql/functions/openjson-transact-sql?view=sql-server-2017




When each property is mapped individually, the result set contains one column per JSON property.



If you don't want to map individual properties, or you have dynamic schema-less properties, you can use the OPENJSON() function without the WITH clause.
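
Without the WITH clause, OPENJSON returns generic key, value and type columns for each element of the array. A minimal sketch of what that query might look like (the SQL shown is an assumption following the pattern above):

var sql = @"SELECT js.[key], js.[value], js.[type]
            FROM Customer c
            CROSS APPLY OPENJSON(c.[CustomerContacts], '$') AS js
            WHERE c.Id = @CustomerId";

// For an array, [key] is the element index, [value] is the JSON object itself,
// and [type] indicates the JSON value type (5 = object).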

Advantages:

1. Easy storage of an object graph in a single table.
2. Easy to add a schema-less column in SQL Server and get NoSQL-style features for a smaller portion of an application, with no need to switch databases.
3. Easy to get and insert records as JSON, using either a fixed schema model or a dynamic model.

Disadvantages:

1. This feature cannot completely replace a NoSQL database.
2. If the dynamic fields rely on a schema driven by the UI, that is not good for security reasons.
3. There is no convenient function for adding a new field to every element of a fixed-schema JSON array.
4. If a row holds a big JSON array, there is a risk of information loss from an accidental edit by a user or DBA. In this particular case it is a really good idea to use the temporal table feature to keep the full history of the table.

If you have a column with an array of JSON objects and you want to add one field to each object in the array, you have to either do string manipulation or use the approach I explained in the Stack Overflow response referenced below.

Example Code: 
https://github.com/atulpatel/SQL_Json

Ref:
https://stackoverflow.com/questions/49828897/update-an-existing-json-value-inside-a-json-array-in-sql
https://docs.microsoft.com/en-us/sql/t-sql/functions/json-modify-transact-sql?view=sql-server-2017
https://docs.microsoft.com/en-us/sql/t-sql/functions/json-query-transact-sql?view=sql-server-2017
