If your SfB project is heading into the "I need SfB to be highly available" realm, then you need to start investigating what it is going to take to bring just ONE three node pool of SfB front end servers into being. Beyond the obvious need for three servers to form the SfB Enterprise Pool, there is no point in doing an SFB EE Pool and have only one SQL server behind it - you will have created a single point of failure in a critical system component - to whit, the supporting backend SQL databases that the pool members use for re-hydration.
SfB is not totally different from Lync Server 2013 in this regard; however, one key difference is that now SfB officially supports using SQL Server AlwaysOn Availability Group (AG) as a supported SQL backend - this in addition to standalones, clusters, and mirrors. If you have a SQL team that wants to use AG as the standard build, then that is what you wil do, eh? The point to this article is to walk through a simple EE pool with AG installation into an existing Lync 2013 environment to highlight some of the lessons learned over the past month or so.
I am not including screen caps of every step, there are plenty of guides out there in blog-land to walk you through SQL and SfB installs.
Planning
Before cranking up some VM space and mounting the ISO's, you may want to consider some needed data points. DNS, IP space, and construction details are always nice to know ahead of time. Aside from five host servers, their FQDN's and appropriate VM host space, here are a few items to ponder:
- Identify the Windows Failover Cluster (WFC) cluster name and IP - do you need to read up on WFC before you start?
- Identify FSW location - do you need to read up on File Share Witness before you start?
- Identify a SQL file share for enabling the AG itself. SQL is going to want a share that it can use to shuttle the initial database backups into so that it can copy them onto (into?) the target secondary node. Why it just don't do it direct is beyond my ken; I just do what I'm told!
- Identify the SQL AG group listener FQDN and IP - maybe you should do some background on this subject too. It certainly would have helped me a bit.
- Identify the SQL service account - don't try this with "network services" or you will be assigning certificates to logins
- Firewalls on the SQL servers need to opened for inbound traffic to flow properly. 1433 and 5022 TCP; 1434 UDP.
- SQL Database location - these must be IDENTICAL between SQL AG nodes.
- 3 FQDNs and IP to match for the SfB EE pool (maybe more!)
- 2 FQDNs and IP to match for our SQL 2014 Enterprise Edition nodes
- 1 FQDN and IP for the WFC cluster
- 1 FQDN and IP for the SQL AG listener
- Service accounts
- File locations
- Firewalls
- Database location
Prepare two servers for SQL. I used 2 cores and 8GB RAM. Because I am only hosting SfB databases on these servers, I used 200GB for drive C and will put everything on the same drive. You may wish to follow a more esoteric construction with separate drives, perhaps SAN-based, and you may need way more space than that especially if your database team is using these servers for other purposes. You may even have to live with the database team telling you what and where, and by whom. If you are having a SQL team provide you an instance and space, then make SURE of the instance name, your permissions to that instance, and the space your environment will need. Permissions will be important.
And of course, you will need three servers for your SfB pool.
Install SQLAlwaysOn-A operating system (I used server 2012 R2)
Install WFC via server manager
Install .net 3.5
Lather, rinse, repeat for SQLAlwaysOn-B
Patch and then patch again. Dang. You would think doing updates ONCE would be sufficient. But...no.
Install SQL 2014 Enterprise - use the SQL Service account for all services. And you might as well make sure the SQL agent is running. If you don't, the ensuing SfB install will complain about it.
FWIW, I also installed SSRS on both nodes. No, SSRS cannot cluster or failover, but you CAN AG the databases, and install SfB templates to each node, and then, if needed, use the second node for your reports.
Configure WFC (see http://stevenpoitras.com/2014/02/microsoft-failover-cluster-configuration-nutanix/)
Configure FSW quorum - you will need FSW from above.
Configure WFC cluster listener with static IP.
Move resources (like change the active) between nodes - this verifies that both nodes can r/w both DNS and FSW and that either node can be listener.
Leave the WFC cluster active on whichever node you want to be the primary - I use SQLAlwaysOn-A for this.
Ensure that Windows firewall sql inbound rules are done: tcp 1433, 5022, udp 1434
Some Lync 2013 work
Official Lync 2013 documentation on TechNet
From 2013:
Export-csconfiguration
Export-cslisconfiguration
Open topology builder from 2013 and save the tbxml, twice, just to be really sure.
Hey, we get to SfB yet?
In case you need to brush up on all of this…
Install SfB admin tools on something other than an existing Lync 2013 server
Open topology from SfB tools. Save the tbxml. Twice. And not with the same file names you used for saving the 2013 version. No turning back now unless you have the tbxml files and the exports.
You've just upgraded the topology. You did the 2013 tbxml saves and the configuration exports, right?
Configure topology. In my case, a net new EE pool.
Make sure that your primary node SQL server has the database folder already defined, and that you remember what you called it; no point in having the Topology Builder choose defaults if you don't have to. Or worse, use the SQL defaults and have your databases buried about 15 levels down. I used c:\sfbdata.
What you need to do is ensure that the AG listener is defined up top, but the SQLAlwaysOn-A node is defined down below. You go back and change this later... But here is the reason: There is no AG yet, there cannot be an AG created until there are databases. And the databases don’t get created until topology publishes. So put the proposed listener at the top entry, and the specific first node down at the bottom.
Publish topology which will install the databases on sqlalwayson-A.tsoorad.net, which you need to do to get the AG to work - cannot make an AG without having a Database to work with!
When the topology publishes, it will flip up a SQL database configure screen. I ALWAYS put my databases somewhere defined by my project. I NEVER let SQL just throw things around. In this case, we are using c:\sfbdata for all SfB database work. This will include the future CMS and the Persistent Chat.
At this point, I paused and did
"install-csdatabase -centralmanagementdatabase -sqlserverfqdn SQLAlwaysOn-A.tsoorad.net -databasepaths c:\sfbdata"
- because I know I am going to be moving the CMS to the new pool at some point and I want those two databases (xds and lis) to be part of the AG. If you wait until later, you will be doing parts of this all over again. Then I paused again for persistent chat databases:
"Install-CsDatabase -DatabaseType persistentchat -SqlServerFqdn SQLAlwaysOn-A.tsoorad.net -DatabasePaths c:\sfbdata -v"
and
"Install-CsDatabase -DatabaseType persistentchatcompliance -SqlServerFqdn SQLAlwaysOn-A.tsoorad.net -DatabasePaths c:\sfbdata -v"
- same reasoning as before. Tastes great, less filling.
Back to SQL...
OK, now we have SfB databases!
Set all SfB databases to FULL backup – some SQL BrightBoy probably has some zippy tsql to do this; being archaic, I do it onesy-twosy.
Backup all databases using SQL Management Studio - just accept the default location - these are not really your backups, this is just a step to ensure that the AG forms properly. AG requires the databases to be backed up first. Yes, same comment here for the backups. I am sure there is some zippy method that I have never bothered learning.
Robocopy the database file structure - robocopy is your friend - the file structure must be exact between servers
If you installed SSRS on both nodes, so go remove reportserver db and reportservertempdb from SQLAlwaysOn-B.tsoorad.net or you won't be able to add those databases to the AG as the database locations will have files in them and that is a no-no. Luckily, you can just delete the databases from the Management Studio.
Configure AG (see http://stevenpoitras.com/2014/02/configure-sql-db-availability-group/)
You may need to add your cluster nodes by NETBIOS to sys.server by doing
sp_addlinkedserver @server=”serverNETBIOS’'
I don’t know that this is an absolute requirement, but adding my nodes before trying to create the AG seemed to make some errors go away. At any rate, I have done it ever since as a matter of rote. YMMV.
SQL Permissions on Node 2
I have the topology already published, so now go look at the databases that you have already made members of the AG. The AG retains the security logins on the database when it establishes the secondary copy; however the same mechanism does NOT replicate the Master.do database logins; Read this as your logins to the secondary (now primary) will fail for the various RTC and CS groups. You will need to work out a method to get the security logins AND their respective SIDS along with lining up those SIDs to the respective database permissions. One option to perform this work (and have it done right the first time) is to run a script such as this (http://www.sqlsoldier.com/wp/sqlserver/transferring-logins-to-a-database-mirror).
SfB Install
Initial install for the first EE pool members.
Had a nasty time with the prereq script. Had to remount the original O/S ISO before the install-windowsfeature -source would work :( I had a copy of the \sxs local to the machine, but the installer did not like it for some reason. Do Windows updates until it don’t wupdate no mo! Don’t forget kb2982006 is a "hotfix available" special. Until you run the script to install all the Skype for Business operating system prerequisites, the hotfix will refuse to install, so you need to run windows updates AFTER the prerequisite script.
Continued with install. Started seeing issues across the new pool members of speech files not installing.
Finally have EE members installing as expected. Why me? All three EE members acted differently during install. Huh? The servers are as close to identical as we can get them. Installed from same source. Used all default locations. Patched from WSUS source. Each server had the same number of updates in the same order. WTFO.
Persistent Chat
We chose to install Pchat to a single server, but use the existing SQL BE and collocate with the FE Pool databases. See above, eh?
FINALLY
Go back to the topology builder and change the SQL definition for the EE pool - remember up there where we set SQL to the AG AND the single node? Now that we have the pool up we need to tell the pool to talk to the SQL AG listener, not just the one node...
Well, we got done. IMHO, way too much manual effort, but apparently with SQL being the way it is, the product group was forced into a corner. Maybe the future holds an automated version of this, but until then,
YMMV