SQL Server Availability Groups – items to check

When there is an availability group issue

Run the following set of queries on the primary:

-- Cluster name and quorum state
SELECT cluster_name, quorum_type_desc, quorum_state_desc
FROM sys.dm_hadr_cluster;

-- Cluster members and their quorum votes
SELECT member_name, member_type_desc, member_state_desc, number_of_quorum_votes
FROM sys.dm_hadr_cluster_members
ORDER BY member_name;

-- Current primary and overall health of each availability group
SELECT primary_replica, primary_recovery_health_desc, synchronization_health_desc
FROM sys.dm_hadr_availability_group_states;

-- Cluster node hosting each replica
SELECT * FROM sys.dm_hadr_availability_replica_cluster_nodes ORDER BY replica_server_name;

-- Role, connection state and health of each replica
SELECT A.replica_server_name, A.join_state_desc, B.role_desc, B.operational_state_desc,
       B.connected_state_desc, B.recovery_health_desc, B.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_cluster_states A
JOIN sys.dm_hadr_availability_replica_states B
  ON A.replica_id = B.replica_id AND A.group_id = B.group_id
ORDER BY A.replica_server_name;

-- Synchronization state and failover readiness of each database on each replica
SELECT A.replica_server_name, B.database_name, B.is_failover_ready, B.is_database_joined,
       C.synchronization_state_desc, C.synchronization_health_desc, C.database_state_desc
FROM sys.dm_hadr_availability_replica_cluster_states A
JOIN sys.dm_hadr_database_replica_cluster_states B
  ON A.replica_id = B.replica_id
JOIN sys.dm_hadr_database_replica_states C
  ON B.replica_id = C.replica_id AND B.group_database_id = C.group_database_id
ORDER BY A.replica_server_name;

and check the following items:

  • SQL Server error logs
  • Windows cluster log – generate it with the PowerShell cmdlet Get-ClusterLog, which writes Cluster.log to %WINDIR%\cluster\reports
  • Windows System event log
  • Cluster diagnostic log files in the SQL Server \LOG directory, with file names of the form SRVNAME_SQLINSTANCENAME_SQLDIAG_XXX.XEL. Their contents can be viewed and filtered by opening the files in SQL Server Management Studio.

Also check items in

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/troubleshoot-always-on-availability-groups-configuration-sql-server

  1. Accounts – use the same domain account with a login in master on both servers, OR different domain accounts with logins in master on both servers and CONNECT granted to each account on the mirroring endpoint, OR use certificates.
  2. Check that the mirroring endpoints use the correct port and are in STATE=STARTED.
  3. Check that the login on the other server has CONNECT permission on the mirroring endpoint.
  4. Check the endpoint URL; a fully qualified domain name is guaranteed to work.
  5. Check connectivity to the endpoint port from the other machine, in both directions.
  6. Check READ_ONLY_ROUTING_URL port connectivity.
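Most of the checklist above can be inspected from T-SQL. The following is a minimal sketch, to be run on each replica; it only reads catalog views and assumes nothing about endpoint or login names. Note that sysadmin members and the endpoint owner can connect without an explicit grant, so they will not appear in the second result set. Network connectivity (items 5 and 6) still has to be tested from the operating system.

```sql
-- Items 2: endpoint port and state (state_desc should be STARTED)
SELECT dme.name AS endpoint_name,
       te.port,
       dme.state_desc,
       dme.role_desc,
       dme.connection_auth_desc,
       dme.encryption_algorithm_desc
FROM sys.database_mirroring_endpoints AS dme
JOIN sys.tcp_endpoints AS te
  ON te.endpoint_id = dme.endpoint_id;

-- Item 3: explicit permissions (e.g. CONNECT) granted on the mirroring endpoint
SELECT ep.name AS endpoint_name,
       pr.name AS grantee,
       sp.permission_name,
       sp.state_desc
FROM sys.server_permissions AS sp
JOIN sys.database_mirroring_endpoints AS ep
  ON ep.endpoint_id = sp.major_id
JOIN sys.server_principals AS pr
  ON pr.principal_id = sp.grantee_principal_id
WHERE sp.class_desc = N'ENDPOINT';

-- Items 4 and 6: endpoint URL and read-only routing URL configured for each replica
SELECT replica_server_name, endpoint_url, read_only_routing_url
FROM sys.availability_replicas
ORDER BY replica_server_name;
```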

and

https://blogs.msdn.microsoft.com/alwaysonpro/2014/11/26/diagnose-unexpected-failover-or-availability-group-in-resolving-state/

  • Open the cluster diagnostic log files in SSMS and filter on state_desc=error
  • Open the cluster diagnostic logs and check for events named component_health_result and availability_group_is_alive_failure
  • Open the Cluster Log and search for “is not healthy” and “SQL Server Availability Group”
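As an alternative to opening the .XEL files in SSMS, they can be read with sys.fn_xe_file_target_read_file. This is a sketch only: the path is a hypothetical example that must be adjusted to the instance's \LOG directory, and the XML path used to extract state_desc is an assumption about how that field is stored in the event.

```sql
-- Hypothetical path: point this at the SQLDIAG files in your instance's \LOG directory.
DECLARE @path nvarchar(260) =
    N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\*_SQLDIAG_*.xel';

SELECT x.object_name,
       x.event_xml.value('(/event/@timestamp)[1]', 'datetime2(3)') AS event_time_utc,
       -- Assumption: state_desc is stored as a plain string field in the event
       x.event_xml.value('(/event/data[@name="state_desc"]/value)[1]', 'nvarchar(60)') AS state_desc,
       x.event_xml
FROM (
    SELECT f.object_name,
           CAST(f.event_data AS xml) AS event_xml
    FROM sys.fn_xe_file_target_read_file(@path, NULL, NULL, NULL) AS f
    WHERE f.object_name IN (N'component_health_result',
                            N'availability_group_is_alive_failure')
) AS x
ORDER BY event_time_utc;
```

If state_desc comes back NULL, inspect the event_xml column to see how the field is actually laid out and adjust the XPath accordingly.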

and

https://support.microsoft.com/en-gb/help/2833707/troubleshooting-automatic-failover-problems-in-sql-server-2012-alwayso?lipi=urn:li:page:d_flagship3_messaging;m05iXFssTryyLKTl1wRM9g%3D%3D

  • Check the Windows Cluster Log for failoverCount, and check Failover Cluster Manager->Roles->Properties->Failover tab->Maximum Failures in the Specified Period.
  • The SQL Server Database Engine resource DLL connects via ODBC to the instance of SQL Server that hosts the primary replica in order to monitor health. The NT AUTHORITY\SYSTEM login account needs the Alter Any Availability Group, Connect SQL and View server state permissions on secondary replicas. Check the Windows Cluster Log for messages like “Failed to run diagnostics command” and “The user does not have permission to perform this action”.
  • Use the queries above to check that the secondary replica is in SYNCHRONIZED status and has is_failover_ready=1.
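The permission check in the second item can be sketched in T-SQL as follows. The first query only reports what NT AUTHORITY\SYSTEM currently has; the GRANT statements are the standard syntax for the three permissions named above and should only be run if they are actually missing in your environment.

```sql
-- Report the server-level permissions currently held by NT AUTHORITY\SYSTEM
SELECT pr.name AS login_name,
       sp.permission_name,
       sp.state_desc
FROM sys.server_principals AS pr
LEFT JOIN sys.server_permissions AS sp
  ON sp.grantee_principal_id = pr.principal_id
WHERE pr.name = N'NT AUTHORITY\SYSTEM';

-- Create the login if it does not exist, then grant the three permissions
IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'NT AUTHORITY\SYSTEM')
    CREATE LOGIN [NT AUTHORITY\SYSTEM] FROM WINDOWS;

GRANT ALTER ANY AVAILABILITY GROUP TO [NT AUTHORITY\SYSTEM];
GRANT CONNECT SQL TO [NT AUTHORITY\SYSTEM];
GRANT VIEW SERVER STATE TO [NT AUTHORITY\SYSTEM];
```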

Also https://social.msdn.microsoft.com/Forums/sqlserver/en-US/d9d4589f-2cb5-405d-a8b9-10e9f1230e13/can-not-create-listner-for-high-availability-group-of-always-on-in-sql-2012-on-cluster-environment?forum=sqldisasterrecovery&lipi=urn%3Ali%3Apage%3Ad_flagship3_messaging%3BSKtmCjYBT8mVfIoum8vrqg%3D%3D

  • The attempt to create the network name and IP address for the listener failed.
  • Check that the ‘Primary DNS suffix of this computer’ is configured correctly.
  • Add the startup account of the cluster service as a SQL Server login and grant it the sysadmin role (the startup account of the cluster service is nt authority\system by default).

Also, in Failover Cluster Manager->Services and applications->AG Properties, increase VerboseLogging.


SQL Server on Linux – new command line tools.

I have been playing with the new SQL Server on Linux command line tools

http://smooth1.co.uk/sqlserver2017/LINUX_CMD_TOOLS.html

It is very nice that mssql-scripter has an option to limit output to a given SQL Server version and, even better, edition. I did find an issue with this option, though, and will be providing feedback to Microsoft.

 


SQL Server on Linux – SQL Server 2017 goes cross platform!

Here are the slides for my recent talks on SQL Server on Linux – SQL Server 2017 goes cross platform!

http://smooth1.co.uk/presents/201705_SSOL/201705_SSOL.zip


SQL Server – Changing Recovery Model from Full to Bulk Logged whilst a Transaction is active.

If you have a SQL Server database and change the Recovery Model from Full to Bulk Logged whilst a transaction is open, what happens?

If the ongoing transaction, which was started under Recovery Model Full, then does an operation which can be minimally logged, what happens?

Does the operation become minimally logged, so that the VLF is tagged as containing minimally logged operations, which in turn prevents certain operations, e.g. restores from the next log backup with STOPAT?

We test with STOPAT and also use fn_dump_dblog to see exactly what ends up in the log backup and how we can identify minimally logged operations in a log backup file.

http://smooth1.co.uk/sqlserver2016/RM_BL_ML.html
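A simpler first check than fn_dump_dblog is whether SQL Server itself flags a log backup as containing bulk-logged data. The sketch below is not the test from the linked page; the database name RMTest and the backup path are hypothetical placeholders, and the minimally loggable operation is left for the reader to supply.

```sql
-- Hypothetical database name and backup path; adjust both.
ALTER DATABASE [RMTest] SET RECOVERY BULK_LOGGED;

-- ... run the operation under test here (inside the transaction opened under FULL) ...

BACKUP LOG [RMTest] TO DISK = N'C:\Backup\RMTest_log1.trn' WITH INIT;

-- The HasBulkLoggedData column of the header is 1 when the backup
-- contains minimally logged changes
RESTORE HEADERONLY FROM DISK = N'C:\Backup\RMTest_log1.trn';

-- The same flag is recorded in the backup history in msdb
SELECT database_name, backup_finish_date, has_bulk_logged_data
FROM msdb.dbo.backupset
WHERE database_name = N'RMTest'
  AND type = 'L'
ORDER BY backup_finish_date;
```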


SQL Server 2016 – Database Scoped Configuration Parameters and Always On Availability Group failovers.

In SQL Server 2016 we have database scoped parameters.

With an Always On Availability Group we can have different database scoped parameter values on the primary compared to the secondaries.

How does this work with an Always On Availability Group failover?

http://smooth1.co.uk/sqlserver2016/AA_AG_DSP.html
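For reference, a minimal sketch of setting different primary and secondary values and then inspecting them. The database name AGTestDb and the MAXDOP values are hypothetical; the statements are run on the primary, and the final SELECT can be repeated on each replica before and after a failover to compare.

```sql
USE [AGTestDb];  -- hypothetical availability group database

-- Value used when this database is the primary
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

-- Value used on readable secondaries
ALTER DATABASE SCOPED CONFIGURATION FOR SECONDARY SET MAXDOP = 8;

-- Show both values side by side
SELECT name, value, value_for_secondary
FROM sys.database_scoped_configurations
ORDER BY name;
```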


SQL Server – checking for Instant File Initialization Permissions

To check for Instant File Initialization Permissions:

 

NOTE: This script needs to be run locally on the machine where you are checking for permissions.

The script assumes the SQL Server instances are running under services whose DisplayName starts with “SQL Server (“; please adapt for your own needs.
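On SQL Server 2016 SP1 and later there is also a T-SQL alternative, shown here as a sketch rather than as the script referred to above: sys.dm_server_services exposes an instant_file_initialization_enabled column. Unlike the script, this runs over a normal connection to the instance instead of locally on the machine.

```sql
-- 'Y' means the Database Engine service account has Instant File Initialization
SELECT servicename,
       service_account,
       instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server (%';
```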


SQL Server 2016, upgrading to compatibility level 130, Trace flag 139 and additional one-off DBCC checks.

https://support.microsoft.com/en-gb/help/4010261/sql-server-2016-improvements-in-handling-some-data-types-and-uncommon-

When upgrading to SQL Server 2016 RTM CU3/SP1 and changing the database compatibility level to 130, there are additional DBCC checks which should be performed.

These are hidden behind Trace flag 139, which should be temporarily enabled as part of the process of changing the database compatibility level to 130.

  1. Enable trace flag 139 by running DBCC TRACEON(139, -1).
  2. Run DBCC CHECKDB/TABLE..WITH EXTENDED_LOGICAL_CHECKS to validate persisted structures.
  3. Run DBCC CHECKCONSTRAINTS commands (if rows are affected, the WHERE clause that identifies each row is returned).
  4. Disable trace flag 139 by running DBCC TRACEOFF(139, -1).
  5. Change the database compatibility level to 130.
  6. REBUILD any structures that were identified by the checks in steps 2 and 3.
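The check and upgrade sequence above can be sketched as the following script. The database name MyDb is a hypothetical placeholder, and the DBCC options shown are one reasonable choice rather than the only one.

```sql
USE [MyDb];  -- hypothetical database name

-- Temporarily enable the new expression evaluation logic for the DBCC checks
DBCC TRACEON(139, -1);

-- Validate persisted structures in the current database
DBCC CHECKDB WITH EXTENDED_LOGICAL_CHECKS, NO_INFOMSGS;

-- Validate all constraints on all tables in the current database
DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS;

DBCC TRACEOFF(139, -1);

-- Only now change the compatibility level
ALTER DATABASE [MyDb] SET COMPATIBILITY_LEVEL = 130;
```

Note down everything the two DBCC commands report before moving on, since those are the objects that need fixing after the compatibility level change.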

There are improvements to expression evaluation in database compatibility level 130, and these affect persisted structures:

  • Check constraints
  • Persisted Computed columns
  • Indexes using computed columns, whether as part of the key or as included columns
  • Filtered indexes
  • Indexed views

Upgrade to compatibility level 130 BEFORE attempting to fix issues, so that the new expression evaluation logic is used for the fixes.

  • Check constraints – change the data, or drop/recreate the constraint with a new expression
  • Persisted computed columns – update a column referenced by the computed column to the same value to force recalculation of the computed column
  • Index/filtered index/indexed views – either A) put the database in single user mode and run DBCC CHECKTABLE with REPAIR_REBUILD, or B) run ALTER INDEX…REBUILD and, if supported in your edition of SQL Server, consider adding the WITH (ONLINE=ON) clause.
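A sketch of what these fixes look like in T-SQL. All object names (MyDb, dbo.MyTable, SourceCol, CK_MyTable_SourceCol, IX_MyTable_Computed) and the constraint expression are hypothetical and only illustrate the shape of each statement.

```sql
-- Check constraint: drop and recreate with a corrected expression
ALTER TABLE dbo.MyTable DROP CONSTRAINT CK_MyTable_SourceCol;
ALTER TABLE dbo.MyTable WITH CHECK
    ADD CONSTRAINT CK_MyTable_SourceCol CHECK (SourceCol >= 0);

-- Persisted computed column: update a referenced column to its own value
-- (add a WHERE clause to limit this to the affected rows)
UPDATE dbo.MyTable SET SourceCol = SourceCol;

-- Index, option B: rebuild (ONLINE = ON requires an edition that supports it)
ALTER INDEX IX_MyTable_Computed ON dbo.MyTable REBUILD WITH (ONLINE = ON);

-- Index, option A: repair under single user mode
ALTER DATABASE [MyDb] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKTABLE (N'dbo.MyTable', REPAIR_REBUILD);
ALTER DATABASE [MyDb] SET MULTI_USER;
```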

NOTE: There are some queries in Appendices C and D of the article above which can be used to help identify affected objects.