This is a NOTICE. Please Read.
Incident Report for LivePOS
Postmortem

Folks, I hope everyone is having some quality time with their families and loved ones. I wanted to give you a quick update on the events of the 24th.

As you know, on the 14th we encountered a hardware malfunction at one of our data center vendors. The vendor remedied the situation in a matter of hours and we were back in full swing. We thought the issue was resolved for good. On the 24th at approximately 1pm PST, our engineers identified the same problem with the same vendor resurfacing. Over the next three hours the vendor’s engineers came up with ideas and hypotheses on how to resolve the issue; however, no one gave us a solid answer as to WHY the issue had reappeared. It was identified as a hardware problem, but it was not clear exactly where the failure was occurring.

To put this uncertainty behind us, we decided to migrate our entire server array to a new region, which in essence means that we are now running the entire LivePOS system on a new set of hardware, leaving the “bad” hardware behind. Once the last store was confirmed closed, at approximately 9pm PST, LivePOS engineers started to migrate the array as planned. The team worked throughout the night, and at approximately 2pm the following day the migration was 97% complete and LivePOS.com, Dashboard and POS transactions were all back online. The remaining 3% was completed the following night, bringing the service to 100%.

With the offline feature working as designed, most of you were ringing up customers and collecting payments without delays. However, some of you had to close and go home before all of your transactions synced. When the stores re-opened Saturday morning, those sales automatically synced and posted on the correct date of the 24th, showing you a blue logo.

With monster companies like Facebook and Netflix (which also uses AWS) going down a few times every year, no cloud solution can be fully immune to this problem. It is a simple fact of life, and while such outages are rare, we at LivePOS want to make sure that we are better prepared if this ever happens again.

In the next few weeks we are going to revisit our offline procedures and add enhancements and capabilities to the LivePOS offline system. We will post enhancements and feature updates in our weekly Friday email newsletter. Also, where applicable, we will be reaching out to those of you who were affected more than others to offer some goodwill compensation.

As I indicated in my last posting, I have been with LivePOS for over 10 years now, and to my knowledge this is the first time in a decade (!) that we have had to deal with data center issues TWICE in the same month. While this is not comforting in any way, it helps to know that these events are rare. Very rare, and like Murphy’s law, they always show up at the worst time.

I want to thank you again for your patience and understanding. In the past few days I have received many customer emails indicating that while the downtime was unpleasant, they were thankful for the constant updates (via our status page) and quick remedy of the situation. Thank you all for your kind words.

I want to wish you and your family a great new year, full of health and wealth.

God bless.

David Miller, Head of IT, LivePOS

Posted Dec 27, 2015 - 17:25 PST

Resolved
We are (almost) done!

LivePOS.com, the Admin site and the POSs are now all back online. There is some final testing we need to do (on things like SMS alerts, etc.), which we will attend to in the next few hours. Thank you and happy holidays.
Posted Dec 25, 2015 - 14:25 PST
Update
We hope you are enjoying your holiday and spending some much-needed time with your families. Our team(s) have been working throughout the night to get everything back online. It looks like we will need another 2-3 hours, which makes it 3pm PST. We will continue to post updates as things progress. Happy holidays everyone :-)
Posted Dec 25, 2015 - 11:55 PST
Update
We have been informed that the time frame given to us by AWS (5-6 hours) will not be sufficient for their engineers to resolve the issue. The new time frame is approximately noon PST on the 25th. We will post updates here as we get more info; however, we may not update this notice until noon.
Posted Dec 25, 2015 - 00:54 PST
Investigating
Dear Customers, we are going to work into the night to remedy the issue some of you encountered today. The LivePOS system will be taken offline tonight, Dec 24th, at 9pm PST for approximately 5-6 hours. During this time POS transactions will not sync (but then again, all of you will be home with your families anyway) and the Admin site will be unavailable. Once the maintenance window is completed, everything will come back online. If you have any transactions still unsynced from today, simply click the red logo and your information will automatically sync to the LivePOS servers. A FULL update on today's red logo incident will be posted once we are done with our investigation.
Posted Dec 24, 2015 - 17:41 PST