2017. 3. 20. 11:39 Brain Trainning/DataBase

Source: https://blogs.msdn.microsoft.com/psssql/2016/10/04/default-auto-statistics-update-threshold-change-for-sql-server-2016/

Default auto statistics update threshold change for SQL Server 2016



Lately, we had a customer who contacted us about a performance issue where their server performed much worse on SQL Server 2016 following an upgrade.  To show us an example, he even captured a video.  In the video, he showed that the session compiling the query had multiple threads waiting on LATCH_EX of ACCESS_METHODS_DATASET_PARENT.  This type of latch is used to synchronize dataset access among parallel threads, and in general it deals with large amounts of data.  Below is a screenshot from the video. Note that I didn't include all of the columns because I don't want to reveal the customer's database and user names.  This is very puzzling because we should not see parallel threads during the true phases of compilation.

 

(screenshot: parallel threads waiting on LATCH_EX of ACCESS_METHODS_DATASET_PARENT)

After staring at it for a moment, we started to realize that this must have something to do with auto update statistics.  Fortunately, we had a copy of pssdiag captured that includes trace data.  To prove that auto update statistics could have caused the issue, we needed to find some evidence of long-running auto update stats events.  After importing the data, we were able to find some auto update stats events that took more than 2 minutes.  These stats updates occurred on the queries the customer pointed out.  Below is an example of an auto update in the profiler trace extracted from the customer's data collection.

 

 

(screenshot: profiler trace showing an auto update stats event running for more than 2 minutes)

Root cause & SQL Server 2016 change

This turned out to be the default auto stats threshold change in SQL 2016.

The KB article Controlling Autostat (AUTO_UPDATE_STATISTICS) behavior in SQL Server documents two thresholds.  I will call them the old threshold and the new threshold.

Old threshold: 20% of rows must change before auto update stats kicks in (there are some tweaks for small tables; for large tables, a 20% change is needed).  For a table with 100 million rows, 20 million row changes are required for auto stats to kick in. For the vast majority of large tables, auto stats basically doesn't do much.
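Not part of the original post, but as a rough illustration you can see how far a table is from that old 20% threshold with sys.dm_db_stats_properties (the same DMF used in the demo further down); dbo.t is a placeholder table name:

select  object_name(st.object_id)      as table_name,
        st.name                        as stats_name,
        sp.rows,
        sp.modification_counter,
        0.20 * sp.rows                 as old_threshold_row_changes  -- changes needed under the old rule
from sys.stats st
cross apply sys.dm_db_stats_properties (st.object_id, st.stats_id) sp
where st.object_id = object_id ('dbo.t')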

New threshold: Starting with SQL Server 2008 R2 SP1, we introduced trace flag 2371 to control auto update statistics better (the new threshold).  Under trace flag 2371, the percentage of changes required is dramatically reduced for large tables.  In other words, trace flag 2371 can cause more frequent updates.  This new threshold is off by default and is enabled by the trace flag.  But in SQL Server 2016, the new threshold is enabled by default for a database with compatibility level 130.

In short:

SQL Server 2014 or below: the default is the old threshold.  You can use trace flag 2371 to activate the new threshold.

SQL Server 2016: the default is the new threshold if the database compatibility level is 130.  If the database compatibility level is below 130, the old threshold is used (unless you use trace flag 2371).
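A quick way to check which regime applies (a minimal sketch; [YourDb] is a placeholder database name, not one from the post):

-- 130 means the SQL Server 2016 default, i.e. the new threshold
select name, compatibility_level
from sys.databases
where name = N'YourDb'
go
-- on SQL Server 2014 or below (or below compatibility level 130),
-- the new threshold can be enabled server-wide with trace flag 2371
dbcc traceon (2371, -1)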

The customer very frequently 'merges' data into some big tables; some of them had 300 million rows.  Because of the threshold change for large tables, the process now triggered much more frequent stats updates.

 

Solution

The solution is to enable asynchronous statistics update.  After the customer implemented this approach, their server performance went back to its previous level.
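A minimal sketch of that change (assuming a database named [YourDb]; this is not the customer's actual script):

alter database [YourDb] set AUTO_UPDATE_STATISTICS_ASYNC on
-- statistics are still refreshed once the change threshold is crossed, but queries
-- no longer wait for the refresh at compile time; they reuse the existing statistics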

 

(screenshot: performance returned to previous levels after enabling asynchronous statistics update)

Demo of auto stats threshold change


--setup a table and insert 100 million rows
drop database testautostats
go
create database testautostats
go
use testautostats
go
create table t (c1 int)
go
set nocount on
declare @i int
set @i = 0
begin tran
while @i < 100000000
begin
    declare @rand int = rand() * 1000000000
    if (@i % 100000 = 0)
    begin
        -- commit every 100,000 rows to keep the open transaction small
        while @@trancount > 0 commit tran
        begin tran
    end
    insert into t values (@rand)
    set @i = @i + 1
end
commit tran
go
create index ix on t (c1)
go

 

 

--run this query and query the stats properties
--note the last_updated column
select count (*) from t join sys.objects o on t.c1=o.object_id
go
select * from sys.stats st cross apply sys.dm_db_stats_properties (object_id, stats_id)
where st.object_id = object_id ('t')

(screenshot: dm_db_stats_properties output showing the initial last_updated value)

--delete 1 million rows
--run the same query and query the stats properties
--note that the last_updated column changed
delete top (1000000) from t
go
select count (*) from t join sys.objects o on t.c1=o.object_id
go
select * from sys.stats st cross apply sys.dm_db_stats_properties (object_id, stats_id)
where st.object_id = object_id ('t')

 

(screenshot: last_updated changed after the delete under compatibility level 130)

--now switch the DB compatibility level to 120
--delete 1 million rows
--note that stats weren't updated (the last_updated column stays the same)
alter database testautostats SET COMPATIBILITY_LEVEL=120
go
delete top (1000000) from t
go
select * from sys.stats st cross apply sys.dm_db_stats_properties (object_id, stats_id)
where st.object_id = object_id ('t')

 

(screenshot: last_updated unchanged under compatibility level 120)


See also: http://www.sqlservergeeks.com/sql-server-trace-flag-2371-to-control-auto-update-statistics-threshold-and-behavior-in-sql-server/



2016. 11. 4. 11:22 Brain Trainning/Cloud
AWS (AWSome Day 2016)


Amazon Web Service

  • DB services: MySQL, Oracle, SQL Server, PostgreSQL, Amazon Aurora, MariaDB; Multi-AZ synchronous replication
  • Adopters: Samsung, Mirae Asset, major game companies, and others

Amazon RDS (Relational Database Service)

  • Amazon RDS offers six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.
  • Read replicas are available in Amazon RDS for MySQL, PostgreSQL, and Amazon Aurora

Session 1: AWS introduction and history

  • Amazon built it well and is running the service diligently.
  • A Management Console is provided, so make good use of it.

Session 2: AWS storage and content delivery

1. Simple Storage Service (S3)
  • http://johnsmith.s3.amazonaws.com/photos/puppy.jpg
  • No limit on the number of objects that can be stored in a bucket; objects can be up to 5 TB each.
  • Provides REST and SOAP interfaces
2. Amazon Glacier
  • Very low-cost storage service for data archiving
3. Amazon Elastic Block Store (EBS)
  • Block-level storage (volume service); volumes of up to 16 TB can be created
  • Attached directly to EC2 and used as a volume.
  • Snapshots can be saved to Amazon S3 when needed.
  • Lifecycle: create -> attach to EC2 -> (backup) save snapshot to S3 -> detach -> delete
  • Provides 3 IOPS per GB (IOPS grows as capacity grows)
  • Provisioned IOPS (PIOPS): choose this when databases and similar workloads require high IOPS
4. AWS Storage Gateway
  • A single VM (software package) image downloaded from Amazon is used like a local volume over an iSCSI connection, and the data stored on it is automatically saved to Amazon S3. Something like a Dropbox drive for servers (and somewhat sensitive to bandwidth).
5. AWS Import/Export
  • Accelerates moving large amounts of data into and out of Amazon S3 or Amazon EBS.
  • Ship disks by courier (DHL, FedEx, etc.) and Amazon attaches them to its servers so large volumes of data can be transferred quickly; the reverse direction is also possible
  • Export is available from S3 only
6. Amazon CloudFront (CDN service)
  • An easy, cost-effective way to distribute content to end users
  • Delivers entire websites, including dynamic, static, streaming, and interactive content, through a global network of edge locations
7. Amazon Elastic File System (NAS service, in beta)
  • Instead of the inconvenience of using the S3 API for shared storage, EFS provides NAS-like access

Session 3: AWS compute products

1. Amazon Elastic Compute Cloud (EC2)
  • Adjust compute power as computing requirements change
  • Choose between Linux (distribution selectable) and Windows
  • How to use it: choose an AWS region (AZ) -> choose an EC2 instance from an AMI (selectable from the AWS Marketplace) -> choose an instance type considering CPU, memory, and network -> choose IP address, security groups, and so on
  • Snapshots for backup (the location in S3 where they are stored can be checked via the API)
  • A free tier is provided by default when you sign up for AWS
2. Auto Scaling
  • Automatically adjusts Amazon EC2 capacity (automatic scale-out and scale-in)
  • Adds instances based on resource usage according to user-defined conditions (policies).
  • Auto Scaling is provided together with ELB (Elastic Load Balancing) and CloudWatch (which adjusts the size based on resource utilization monitoring)
3. Amazon Virtual Private Cloud (VPC)
  • Provisions a private, isolated space in the AWS cloud so that AWS resources can be launched in a customer-defined virtual network
  • Gives complete control over the virtual network environment.
  • Amazon Elastic IP: static IP service
  • Amazon VPC can be connected to your own IT infrastructure through an encrypted VPN connection.
4. AWS Direct Connect (dedicated-line service)
  • Establishes a dedicated (physical) network connection between the customer's on-premises environment and AWS
  • Only 1 Gbps and 10 Gbps are offered directly, but other speeds can be arranged through partners
5. Amazon Route 53 (DNS service)
  • Routes end users to internet applications (domain name registration is also offered)
  • Can also route traffic coming from a specific region to a specific endpoint
6. Elastic Load Balancing
  • Distributes traffic across multiple instances
  • Supports routing and balancing of HTTP, HTTPS, and TCP traffic to Amazon EC2 instances
  • Constraints: L4 (port-based) service only, no L7 role; works only with EC2.

Session 4: AWS managed services and databases

1. Amazon Relational Database Service (RDS)
  • Supports MySQL, PostgreSQL, Oracle, SQL Server, and Amazon Aurora
  • Of these, replication is provided for MySQL, PostgreSQL, and Aurora
  • Automatic database software patching and database backups (patching is applied to the standby first)
  • Compute resources or storage capacity associated with a relational database instance can be scaled with a single API call
  • The core features of Amazon RDS are the same whether or not it runs inside a VPC; it manages backups, software patching, automatic failure detection, and recovery (how are version upgrades handled?)
2. Amazon DynamoDB (NoSQL)
  • No capacity limit, and it uses SSDs to deliver fast, predictable performance
  • Request capacity can easily be provisioned and changed per table (create a table and specify its capacity)
  • How is sharding handled? (It shards by automatically partitioning according to the required throughput and size.)
3. Amazon ElastiCache (in-memory service)
  • Choose EC2 capacity as needed -> choose a cache engine (Memcached, Redis) -> (built in an active-standby structure automatically)
  • Memcached-compatible and Redis-compatible on-demand cache clusters
  • Use cases: a cache in front of the DB tier (read-heavy data that rarely changes); used together with other AWS services.
  • Lowers latency and increases throughput for many read-heavy application workloads.
4. Amazon Redshift (dedicated DW service)
  • A petabyte-scale service that manages all the work needed to set up, operate, and scale a data warehouse cluster
  • A simple way to increase performance by scaling out the cluster
  • A DW service built on the PostgreSQL engine

Session 5: AWS deployment and management

1. AWS Identity and Access Management (IAM)
  • Create and manage AWS users and groups
  • Enables identity federation between a corporate directory and AWS services (groups and users from LDAP can use AWS as they are, without separate authentication)
  • Permissions for AWS resources can also be fully managed
2. Amazon CloudWatch
  • Monitoring of resource utilization, operational performance, and overall demand patterns (monitoring of the virtualized services)
  • Accessible through the AWS Management Console, API, SDKs, or CLI
  • Custom metrics (Apache session information, the memory of a specific process, access logs, error logs, and so on) can be fed into CloudWatch for monitoring and to raise events.
3. AWS Elastic Beanstalk
  • A service specialized for deploying and scaling web applications
  • Choose a service platform (Java, Python, Docker, Ruby, etc.) -> choose a single instance or Auto Scaling -> choose detailed options -> done -> upload the source
  • A platform similar to Docker?
4. AWS CloudFormation
  • A very flexible service for deploying and updating a template and its associated collection of resources. The collection of resources is called a 'stack' and is managed with the AWS Management Console, the AWS CloudFormation command line tools, or the API
  • Unlike Elastic Beanstalk, this service requires detailed configuration (using a JSON-format text file)
  • Most AWS services can be created and deployed by the user through a settings file.
  • CloudFormer can extract CloudFormation JSON from an existing deployment.
5. AWS OpsWorks
  • A service somewhere between Beanstalk and CloudFormation, similar to a managed Chef service; requires knowledge of Ruby.
  • Chef recipes and Bash scripts can be reused as-is to change the configuration of multiple EC2 instances.
  • The entire service configuration built with Python, Java, and so on is called a stack, which is then layered into related resources such as DB and WAS tiers, and finally apps are served on top of the instances. (A stack represents the compute infrastructure and applications you want to manage together.)

Miscellaneous

Participation in Amazon's small-group offline training: to be confirmed (learning link: https://aws.qwiklab.com)


2016. 8. 19. 11:00 Brain Trainning/Bigdata

This is a simple example of installing Apache Zeppelin on OS X.

The installation procedure is actually well documented at http://zeppelin.apache.org.


It is not much different from installing the binary package on Linux. And if needed, downloading the source from the Zeppelin home page and compiling it yourself is not particularly difficult either.

However, since I plan to use it for testing, I downloaded and installed the binary.

  • Download the file using curl. Downloading it directly from the home page works just as well.
    • curl -O "http://mirror.apache-kr.org/zeppelin/zeppelin-0.6.1/zeppelin-0.6.1-bin-all.tgz"
    • tar -zxf zeppelin-0.6.1-bin-all.tgz 


  • Move to the bin folder and start zeppelin-daemon.sh.

    ./zeppelin-daemon.sh start


  • Connect to http://localhost:8080 and test it.

Note: the first time you run it, the interpreters need to be configured. Unless you have special requirements, no separate settings are needed; just click Save once. After loading the tutorial notebook, the interpreter settings are shown at the top; scroll down a little and you will see the Save button, which is all you need to click.

I use pyspark. To check the status of running Spark jobs, connect to http://localhost:4040.



  • For interpreter binding, a screen like the one below appears when you open a notebook.

    Look through it and click the SAVE button; there does not seem to be anything that needs changing. If you saved without making a change you wanted and the settings screen below is no longer visible, click the settings (gear) button at the top right to bring it back.

    (screenshots: interpreter binding settings)




2016. 7. 6. 16:17 Brain Trainning/DataBase

Understanding how SQL Server stores data in data files




Source: https://www.mssqltips.com/sqlservertip/4345/understanding-how-sql-server-stores-data-in-data-files/


Problem

Have you ever thought about how SQL Server stores data in its data files? As you know, data in tables is stored in row and column format at the logical level, but physically SQL Server stores data in data pages, which are allocated from the database's data files. In this tip I will show how pages are allocated to data files and what happens when there are multiple data files for a SQL Server database.

Solution

Every SQL Server database has at least two operating system files: a data file and a log file. Data files can be of two types: primary or secondary.  The primary data file contains startup information for the database and points to the other files in the database; user data and objects can be stored in this file, and every database has exactly one primary data file. Secondary data files are optional and can be used to spread data across multiple files/disks by putting each file on a different disk drive. SQL Server databases can have multiple data and log files, but only one primary data file. On top of these operating system files sit filegroups, which work as logical containers for the data files; a filegroup can have multiple data files.
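For illustration only, here is a hedged sketch of adding a filegroup and a secondary data file to a database (the names and path are placeholders, not taken from this tip):

ALTER DATABASE [YourDb] ADD FILEGROUP [FG_Secondary]
GO
ALTER DATABASE [YourDb]
ADD FILE ( NAME = N'YourDb_Data2',
           FILENAME = N'C:\MSSQL\DATA\YourDb_Data2.ndf',
           SIZE = 5MB, FILEGROWTH = 10MB )
TO FILEGROUP [FG_Secondary]
GO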

The disk space allocated to a data file is logically divided into pages, the fundamental unit of data storage in SQL Server. A database page is an 8 KB chunk of data. When you insert any data into a SQL Server database, it saves the data to a series of 8 KB pages inside the data file. If multiple data files exist within a filegroup, SQL Server allocates pages to all data files based on a round-robin mechanism. So if we insert data into a table, SQL Server allocates pages first to data file 1, then to data file 2, and so on, then back to data file 1 again. SQL Server achieves this with an algorithm known as proportional fill.

The proportional fill algorithm is used when allocating pages so that all data files receive allocations at around the same time. It determines the amount of data that should be written to each of the data files in a multi-file filegroup based on the proportion of free space within each file, which allows the files to become full at approximately the same time. In other words, proportional fill works based on the free space within a file.

Analyzing How SQL Server Data is Stored

Step 1: First we will create a database named "Manvendra" with three data files (1 primary and 2 secondary data files) and one log file by running the below T-SQL code. You can change the name of the database, file path, file names, size and file growth according to your needs.

CREATE DATABASE [Manvendra]
 CONTAINMENT = NONE
 ON  PRIMARY
( NAME = N'Manvendra', FILENAME = N'C:\MSSQL\DATA\Manvendra.mdf',SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 10MB ),
( NAME = N'Manvendra_1', FILENAME = N'C:\MSSQL\DATA\Manvendra_1.ndf',SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 10MB ),
( NAME = N'Manvendra_2', FILENAME = N'C:\MSSQL\DATA\Manvendra_2.ndf' ,SIZE = 5MB , MAXSIZE = UNLIMITED, FILEGROWTH = 10MB )
 LOG ON
( NAME = N'Manvendra_log', FILENAME = N'C:\MSSQL\DATA\Manvendra_log.ldf',SIZE = 10MB , MAXSIZE = 1GB , FILEGROWTH = 10%)
GO

Step 2: Now we can check the available free space in each data file of this database to track the sequence of page allocations to the data files. There are multiple ways to check such information and below is one option. Run the below command to check free space in each data file.

USE Manvendra
GO
Select DB_NAME() AS [DatabaseName], Name, file_id, physical_name,
    (size * 8.0/1024) as Size,
    ((size * 8.0/1024) - (FILEPROPERTY(name, 'SpaceUsed') * 8.0/1024)) As FreeSpace
    From sys.database_files

You can see the data file names, file IDs, physical name, total size and available free space in each of the database files.

(screenshot: data file free space after database creation)

We can also check how many extents are allocated for this database by running the DBCC command below. Although this is an undocumented DBCC command, it can provide very useful information.

USE Manvendra
GO
DBCC showfilestats

With this command we can see the number of extents for each data file. As you may know, the size of each data page is 8 KB, and eight contiguous pages equal one extent, so the size of an extent is 64 KB. We created each data file with a size of 5 MB, so the total number of available extents is 80, which is shown in the TotalExtents column; we can get this from (5*1024)/64.
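The same arithmetic can also be read from the metadata; a small sketch (size in sys.database_files is reported in 8 KB pages, so dividing by 8 gives extents):

USE Manvendra
GO
SELECT name,
       size / 128.0 AS SizeMB,       -- 128 pages of 8 KB = 1 MB
       size / 8     AS TotalExtents  -- 8 pages of 8 KB = 1 extent (64 KB)
FROM sys.database_files
WHERE type_desc = 'ROWS'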

UsedExtents is the number of extents allocated with data. As I mentioned above, the primary data file includes system information about the database, so this is why this file has a higher number of UsedExtents.

(screenshot: used extents after database creation)

Step 3: The next step is to create a table in which we will insert data. Run the below command to create a table. Once the table is created we will run both commands again which we ran in step 2 to get the details of free space and used/allocated extents.

USE Manvendra;
GO
CREATE TABLE [Test_Data] (
    [Sr.No] INT IDENTITY,
    [Date] DATETIME DEFAULT GETDATE (),
    [City] CHAR (25) DEFAULT 'Bangalore',
    [Name] CHAR (25) DEFAULT 'Manvendra Deo Singh');

Step 4: Check the allocated pages and free space available in each data file by running same commands from step 2. 

USE Manvendra
Go
Select DB_NAME() AS [DatabaseName], Name, file_id, physical_name,
    (size * 8.0/1024) as Size,
    ((size * 8.0/1024) - (FILEPROPERTY(name, 'SpaceUsed') * 8.0/1024)) As FreeSpace
    From sys.database_files

You can see that there is no difference between this screenshot and the one above, except for a small difference in the FreeSpace of the transaction log file.

(screenshot: database file space after table creation)

Now run the below DBCC command to check the allocated pages for each data file.

DBCC showfilestats

You can see that the allocated pages of each data file have not changed.

(screenshot: used extents after table creation)

Step 5: Now we will insert some data into this table to fill each of the data files. Run the below command to insert 10,000 rows to table Test_Data.

USE Manvendra
go
INSERT INTO Test_DATA DEFAULT VALUES;
GO 10000

Step 6: Once data is inserted we will check the available free space in each data file and the total allocated pages of each data file.

USE Manvendra
Go
Select DB_NAME() AS [DatabaseName], Name, file_id, physical_name,
    (size * 8.0/1024) as Size,
    ((size * 8.0/1024) - (FILEPROPERTY(name, 'SpaceUsed') * 8.0/1024)) As FreeSpace
    From sys.database_files

You can see the difference between the screenshot below and the above screenshot. Free space in each data file has been reduced and the same amount of space has been allocated from both of the secondary data files, because both files have the same amount of free space and proportional fill works based on the free space within a file. 

(screenshot: database file space after the data insert)

Now run the below DBCC command to check the allocated pages for each data file.

DBCC showfilestats

You can see a few more pages have been allocated for each data file. Now the primary data file has 41 extents and the secondary data files have a total of 10 extents, so the data saved so far occupies 51 extents. Both secondary data files have the same number of extents allocated, which demonstrates the proportional fill algorithm.

(screenshot: used extents after the data insert)

Step 7: We can also see where data is stored for table "Test_Data" for each data file by running the below DBCC command. This will let us know that data is stored on all data files.

DBCC IND ('Manvendra', 'Test_data', -1);

I attached two screenshots because there were too many rows to show all the data file IDs where data has been stored in a single image. File IDs are shown in each screenshot, so we can see each data page and its respective file ID. From this we can say that table Test_Data is saved on all three data files, as shown in the following screenshots.
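If you prefer an aggregated count per file over scrolling through the DBCC IND rows, a hedged alternative (assuming SQL Server 2012 or later, where the likewise undocumented sys.dm_db_database_page_allocations function is available) is:

SELECT allocated_page_file_id AS data_file_id,
       COUNT(*)               AS allocated_pages
FROM sys.dm_db_database_page_allocations (DB_ID('Manvendra'), OBJECT_ID('Test_Data'), NULL, NULL, 'LIMITED')
GROUP BY allocated_page_file_id
ORDER BY data_file_id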

(screenshot: DBCC IND output, part 1 - page allocations per data file for Test_Data)



(screenshot: DBCC IND output, part 2)

Step 8: We will repeat the same exercise again to check space allocation for each data file. Insert an additional 10,000 rows to the same table Test_Data to check and validate the page allocation for each data file. Run the same command which we ran in step 5 to insert 10,000 more rows to the table test_data. Once the rows have been inserted, check the free space and allocated extents for each data file.

USE Manvendra
GO
INSERT INTO Test_DATA DEFAULT VALUES;
GO 10000
Select DB_NAME() AS [DatabaseName], Name, file_id, physical_name,
    (size * 8.0/1024) as Size,
    ((size * 8.0/1024) - (FILEPROPERTY(name, 'SpaceUsed') * 8.0/1024)) As FreeSpace
    From sys.database_files

We can see that again both secondary data files have the same amount of free space, and a similar amount of space has been allocated from the primary data file as well. This shows that SQL Server uses the proportional fill algorithm to fill data into the data files.

(screenshot: database file space after the second data insert)

We can get the extent information again for the data files.

DBCC showfilestats

Again we can see an increase in UsedExtents for all three data files.

(screenshot: used extents after the second data insert)
Next Steps
  • Create a test database and follow these steps, so you can better understand how SQL Server stores data at a physical and logical level. 
  • Explore more knowledge with SQL Server Database Administration Tips



2016. 6. 22. 09:57 Brain Trainning/DataBase



Graphing MySQL performance with Prometheus and Grafana


출처 : https://www.percona.com/blog/2016/02/29/graphing-mysql-performance-with-prometheus-and-grafana/


| February 29, 2016 | Posted In: Monitoring, MySQL, Prometheus

This post explains how you can quickly start using such trending tools as Prometheus and Grafana for monitoring and graphing of MySQL and system performance.

First of all, let me mention that the Percona Monitoring and Management beta has recently been released, which is an easy way to get all of this.

I will try to keep this blog as short as possible, so you can quickly set things up before getting bored. I plan to cover the details in the next few posts. I am going to go through the installation process here in order to get some really useful and good-looking graphs in the end.

Overview

Prometheus is an open-source service monitoring system and time series database. In short, this quite efficient daemon scrapes metrics from remote machines over HTTP and stores the data in its local time-series database. Prometheus provides a simple web interface, a very powerful query language, an HTTP API, etc. However, the storage is not designed to be durable for the time being.

The remote machines need to run exporters to expose metrics to Prometheus. We will be using the following two: node_exporter and mysqld_exporter.

Grafana is an open source, feature-rich metrics dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus, and InfluxDB. It is a powerful tool for visualizing large-scale measurement data and is designed to work with time series. Grafana supports different types of graphs, allows for custom representation of individual metrics on the graph, and supports various methods of authentication, including LDAP.

Diagram

Here is a diagram of the setup we are going to use:
(diagram: Prometheus + Grafana setup)

Prometheus setup

To install on the monitor host.

Get the latest tarball from Github.

Create a simple config:

In this config, 192.168.56.107 is the IP address of the db host we are going to monitor and db1 is its short name. Note that the "alias" label is important here because we rely on it in the predefined dashboards below to get per-host graphs.

Start Prometheus in foreground:

Now we can access Prometheus' built-in web interface at http://monitor_host:9090.

(screenshot: Prometheus web interface)
If you look at the Status page from the top menu, you will see that our monitoring targets are down so far. Now let's set them up: the Prometheus exporters.

Prometheus exporters setup

Install on the db host. Of course, you can use the same monitor host for the experiment. Obviously, this node must run MySQL.

Download exporters from here and there.

Start node_exporter in foreground:

Unlike node_exporter, mysqld_exporter wants MySQL credentials. Those privileges should be sufficient:

Create .my.cnf and start mysqld_exporter in foreground:

At this point we should see our endpoints are up and running on the Prometheus Status page:
(screenshot: Prometheus Status page showing the endpoints up)

Grafana setup

Install on the monitor host.

Grafana has RPM and DEB packages. The installation is as simple as installing one package.
RPM-based system:

or APT-based one:

Open and edit the last section of /etc/grafana/grafana.ini resulting in the following ending:

Percona has built the predefined dashboards for Grafana with Prometheus for you.

Let’s get them deployed:

It is important to apply the following minor patch on Grafana 2.6 in order to use the interval template variable and get nicely zoomable graphs. The fix is simply to allow a variable in the Step field on the Grafana graph editor page. For more information, take a look at PR#3757 and PR#4257. We hope the latter will be released with the next Grafana version.

Those changes are idempotent.

Finally, start Grafana:

At this point, we are one step away from being done. Log in to the Grafana web interface at http://monitor_host:3000 (admin/admin).

Go to Data Sources and add one for Prometheus:
(screenshot: Grafana data source settings)

Now check out the dashboards and graphs. Say, choose "System Overview" and the period "Last 5 minutes" at the top right. You should see something similar:
(screenshot: Grafana dashboard)
If your graphs are not populating, ensure the system time is correct on the monitor host.

Samples

Here are some real-world samples (images are clickable and scrollable):
(screenshots: sample dashboards)

Enjoy!

Conclusion

Prometheus and Grafana make a great tandem for enabling monitoring and graphing capabilities for MySQL. The tools are pretty easy to deploy, and they are designed for time series with high efficiency in mind. In the next blog posts I will talk more about technical aspects, problems, and related topics.



2016. 6. 6. 22:11 Brain Trainning/DataBase

50 Important Queries in SQL Server


Source: http://www.c-sharpcorner.com/article/50-important-queries-in-sql-server/


In this article I will explain some general-purpose queries. I think every developer should know these queries. They are not tied to any specific SQL topic, but knowing such queries can solve some complex tasks and they can be used in many scenarios, so I decided to write an article about them.

Query 1: Retrieve List of All Databases

EXEC sp_helpdb

Example: (screenshot of output)

Query 2: Display Text of Stored Procedure, Trigger, View 

exec sp_helptext @objname = 'Object_Name'

Example: (screenshot of output)

Query 3: Get All Stored Procedures Related To a Database

SELECT DISTINCT o.name, o.xtype
FROM syscomments c
INNER JOIN sysobjects o ON c.id = o.id
WHERE o.xtype = 'P'

Example: (screenshot of output)

To retrieve views use 'V' instead of 'P', and for functions use 'FN'.
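For example, here is the same query adjusted for views (a sketch of the substitution described above):

SELECT DISTINCT o.name, o.xtype
FROM syscomments c
INNER JOIN sysobjects o ON c.id = o.id
WHERE o.xtype = 'V'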

Query 4: Get All Stored Procedures Related To a Table

SELECT DISTINCT o.name, o.xtype
FROM syscomments c
INNER JOIN sysobjects o ON c.id = o.id
WHERE c.TEXT LIKE '%Table_Name%' AND o.xtype = 'P'

Example: (screenshot of output)

To retrieve views use 'V' instead of 'P', and for functions use 'FN'.

Query 5: Rebuild All Indexes of a Database

EXEC sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?', ' ', 80)"
GO
EXEC sp_updatestats
GO

Example: (screenshot of output)

Query 6: Retrieve All Dependencies of a Stored Procedure

This query returns the names of all objects used inside a stored procedure, such as tables, user-defined functions, and other stored procedures.

Query:

;WITH stored_procedures AS (
SELECT
    oo.name AS table_name,
    ROW_NUMBER() OVER (PARTITION BY o.name, oo.name ORDER BY o.name, oo.name) AS row
FROM sysdepends d
INNER JOIN sysobjects o ON o.id = d.id
INNER JOIN sysobjects oo ON oo.id = d.depid
WHERE o.xtype = 'P' AND o.name LIKE '%SP_NAme%' )
SELECT Table_name FROM stored_procedures
WHERE row = 1

Example: (screenshot of output)
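On newer versions of SQL Server (2008 and later), a hedged alternative to parsing sysdepends is the sys.dm_sql_referenced_entities function; 'dbo.SP_Name' below is a placeholder object name:

SELECT DISTINCT referenced_schema_name, referenced_entity_name
FROM sys.dm_sql_referenced_entities ('dbo.SP_Name', 'OBJECT')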

Query 7: Find Byte Size Of All Tables in a Database

SELECT sob.name AS Table_Name,
       SUM(sys.length) AS [Size_Table(Bytes)]
FROM sysobjects sob, syscolumns sys
WHERE sob.xtype = 'u' AND sys.id = sob.id
GROUP BY sob.name

Example: (screenshot of output)

Query 8: Get all tables that don't have an identity column

Query:

SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME NOT IN
(
    SELECT DISTINCT c.TABLE_NAME
    FROM INFORMATION_SCHEMA.COLUMNS c
    INNER JOIN sys.identity_columns ic ON (c.COLUMN_NAME = ic.NAME)
)
AND TABLE_TYPE = 'BASE TABLE'

Example: (screenshot of output)

Query 9: List of Primary Keys and Foreign Keys for the Whole Database

SELECT DISTINCT
    Constraint_Name AS [Constraint],
    Table_Schema AS [Schema],
    Table_Name AS [TableName]
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
GO

Example: (screenshot of output)

Query 10: List of Primary Keys and Foreign Keys for a particular table

SELECT DISTINCT
    Constraint_Name AS [Constraint],
    Table_Schema AS [Schema],
    Table_Name AS [TableName]
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
WHERE INFORMATION_SCHEMA.KEY_COLUMN_USAGE.TABLE_NAME = 'Table_Name'
GO

Example: (screenshot of output)

Query 11: RESEED Identity of all tables

EXEC sp_MSForEachTable '
IF OBJECTPROPERTY(object_id(''?''), ''TableHasIdentity'') = 1
    DBCC CHECKIDENT (''?'', RESEED, 0)
'

Example: (screenshot of output)

Query 12: List of tables with number of records

CREATE TABLE #Tab
(
    Table_Name [varchar](max),
    Total_Records int
);
EXEC sp_MSForEachTable @command1 = 'Insert Into #Tab(Table_Name, Total_Records) SELECT ''?'', COUNT(*) FROM ?'
SELECT * FROM #Tab t ORDER BY t.Total_Records DESC;
DROP TABLE #Tab;

Example: (screenshot of output)

Query 13: Get the version name of SQL Server

SELECT @@VERSION AS Version_Name

Example: (screenshot of output)

Query 14: Get Current Language of SQL Server

SELECT @@LANGUAGE AS Current_Language;

Example: (screenshot of output)

Query 15: Disable all constraints of a table

ALTER TABLE Table_Name NOCHECK CONSTRAINT ALL

Example: (screenshot of output)

Query 16: Disable all constraints of all tables

EXEC sp_MSForEachTable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'

Example: (screenshot of output)

Query 17: Get Current Language Id

SELECT