Surviving a Reboot of the SCM Client
Application managers want to be able to reboot edge devices at the end of an SCM deployment so that they can apply firmware, operating system, and application updates that require a device restart. A reboot can occur asynchronously at any time. Its timing cannot be associated with any internal state change for a particular job on the client, because a separate service initiates the reboot after detecting a successful SCM installation. When a reboot occurs, the client can be idle, receiving a state change, transferring a file, or executing a job.
The reboot persistence feature is enabled by default.
The following types of shutdowns can also cause the device to reboot:
“Orderly”, where the operating system notifies the service or daemon
“Signaled” through some type of semaphore, such as the existence of a file
“Unexpected” shutdown, such as communications or power loss. Note that during an unexpected shutdown, it is possible to lose data.
The Orderly and Signaled shutdowns are the best because the client can prepare for the shutdown. In the event of an Orderly or Signaled shutdown, the client will:
1. Unbind any SCM Things to prevent the receipt of new updates and to suspend the delivery of new file transfer data.
2. Suspend all state transitions.
3. Kill the running job, if one exists, while keeping its state set to TW_SWUPDATE_INSTALLING. This action may leave the job in an indeterminate state, which the job script must deal with the next time that the job is run.
4. Persist the current job list to disk in a well-known location.
5. Terminate itself.
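The five steps above can be sketched as a single shutdown routine. This is only an illustration of the ordering, not the SCM client's actual code: the DemoClient structure, the DEMO_JOB_INSTALLING constant, and every demo_ function are hypothetical stand-ins for SDK internals.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from the SCM SDK. */
enum { DEMO_JOB_IDLE = 0, DEMO_JOB_INSTALLING = 1 };

typedef struct {
    int bound;              /* 1 while SCM Things are bound to the platform */
    int transitionsEnabled; /* 1 while job state transitions are allowed */
    int jobRunning;         /* 1 while a job script is executing */
    int jobState;           /* last recorded state of the current job */
    int persisted;          /* 1 once the job list has been written */
    char log[16];           /* order in which the steps ran, for inspection */
} DemoClient;

/* Appends a one-letter step marker to the client's log. */
static void demo_note(DemoClient *c, const char *step) {
    strncat(c->log, step, sizeof(c->log) - strlen(c->log) - 1);
}

/* Runs the orderly or signaled shutdown steps in the documented order.
 * Returns 0 when the client is ready to terminate. */
int demo_prepare_for_shutdown(DemoClient *c) {
    assert(c != NULL);
    c->bound = 0;                /* 1. unbind: no new updates or file data */
    demo_note(c, "U");
    c->transitionsEnabled = 0;   /* 2. suspend all state transitions */
    demo_note(c, "S");
    if (c->jobRunning) {         /* 3. kill the job but keep its state */
        c->jobRunning = 0;       /*    jobState is deliberately untouched */
        demo_note(c, "K");
    }
    c->persisted = 1;            /* 4. persist the job list (stubbed here) */
    demo_note(c, "P");
    demo_note(c, "T");           /* 5. the caller terminates after this */
    return 0;
}
```

Note that step 3 stops the job without changing its recorded state, which is why the job script has to cope with a partially completed run the next time it executes.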
In the event of an unexpected shutdown, the possibility of data loss exists but is minimized by the reboot persistence feature. The existing API, twSwUpdateManager_RegisterNotification(), introduces a handler that persists the entire job list to disk in a well-known location as soon as a state change has occurred. This handler is persistJobStateToDiskHandler(twSwUpdateJob* job). After the shutdown and restart of the SCM client, you must ensure that the client connects to the ThingWorx Platform BEFORE calling the handler that restores the job list. This handler is restoreStateFromDiskStartupHandler(twSwUpdateManager *swUpdateManager).
If you do not connect the edge client to the ThingWorx Platform before starting the SCM client, you will encounter up to a 60-second delay while the reboot survival feature times out.
This technique of persisting the list of SCM jobs allows a window of potential state loss if a state change is being received at the moment of shutdown. The same principle applies to any file transfer in progress. While this technique minimizes the failure window, if a state is promoted but not yet persisted, the disconnect between client and server could cause a state synchronization issue, causing one or more jobs to eventually go into the “Failed has issues” state.
Persistence of the Job List
The job list is persisted as a JSON file that contains an array of the actual job data structures. By default, the job list persists to the current working directory (application default directory). To override this default, call twScmConfig_SetStagingDir(value), supplying a path as the value.
Once the list is persisted, the persistJobStateToDiskHandler triggers the writing of the file. More events have been added to cover all job list modifications. The use of this handler is optional.
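As a rough picture of what persisting the list as a JSON array involves, the following sketch writes a reduced job record with plain stdio. It is not the SDK's implementation, which works with cJSON objects: the DemoJob structure and demo_persist_jobs() are hypothetical, only three of the job fields are written, and the strings are assumed to need no JSON escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced job record; the real twSwUpdateJob has more fields. */
typedef struct {
    const char *id;
    const char *entityName;
    int state;
} DemoJob;

/* Writes the jobs as a JSON array to path. Returns 0 on success and -1 on
 * failure. Strings are assumed to need no JSON escaping in this sketch. */
int demo_persist_jobs(const char *path, const DemoJob *jobs, size_t count) {
    FILE *out;
    size_t i;

    if (path == NULL || (jobs == NULL && count > 0)) return -1;
    out = fopen(path, "w");
    if (out == NULL) return -1;

    fputs("[\n", out);
    for (i = 0; i < count; i++) {
        fprintf(out,
                "  {\"id\": \"%s\", \"entityName\": \"%s\", \"state\": %d}%s\n",
                jobs[i].id, jobs[i].entityName, jobs[i].state,
                (i + 1 < count) ? "," : "");  /* no comma after the last job */
    }
    fputs("]\n", out);
    return fclose(out) == 0 ? 0 : -1;
}
```

Because the whole list is rewritten on each call, a handler that invokes a routine like this on every state change always leaves a complete array on disk rather than a partial update.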
Restoration of the Job List on Restart
Upon restart, the job list can be restored. However, it is very important that the client connect with the ThingWorx Platform BEFORE attempting to restore the job list. Otherwise, if the platform is no longer aware of a job but the client is, the situation can lock up the client. By connecting first, you can ask the platform if the job still matters. You must be connected to the platform for jobs to be restored properly.
Once connected to the platform, calling the restoreStateFromDiskStartupHandler triggers the restoration of the job list to the Software Update Manager.
If a job was in the TW_SWUPDATE_INSTALLING state when the reboot occurred, it is converted to TW_SWUPDATE_COMPLETED when it is restored.
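The two rules in this section, connect first and convert interrupted installations, can be sketched as follows. The DemoState values, the return codes, and demo_restore_jobs() are hypothetical; the numeric values are arbitrary and do not correspond to the SDK's TW_SWUPDATE_* constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states and return codes; the values are arbitrary and do
 * not match the SDK's TW_SWUPDATE_* constants. */
typedef enum { DEMO_DOWNLOADING, DEMO_INSTALLING, DEMO_COMPLETED } DemoState;
enum { DEMO_OK = 0, DEMO_NOT_CONNECTED = -1 };

/* Restores job states that were read back from disk. Refuses to run until
 * the client is connected, and converts jobs interrupted mid-install. */
int demo_restore_jobs(int connected, DemoState *states, size_t count) {
    size_t i;

    if (!connected) return DEMO_NOT_CONNECTED;  /* connect first, then restore */
    for (i = 0; i < count; i++) {
        if (states[i] == DEMO_INSTALLING) {
            states[i] = DEMO_COMPLETED;         /* documented conversion */
        }
    }
    return DEMO_OK;
}
```

Returning an error instead of restoring while disconnected mirrors the warning above: the platform may no longer know about a job that the client still holds.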
Reboot Warning File
The reboot persistence feature includes a way to warn the client of an approaching shutdown. You can create a rebootWarningFile. If you designate a full path, including the file name, to the rebootWarningFile, the client will suspend itself, persist its job information and disconnect. It will then delete the rebootWarningFile as a signal that it is now safe to reboot.
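The handshake described above can be pictured as one polling pass on the client side. This is a sketch under assumptions, not the client's code: demo_check_reboot_warning(), the prepare callback, and the polling approach itself are hypothetical, and the real client may detect the file differently.

```c
#include <assert.h>
#include <stdio.h>

int demoPrepared = 0;  /* set by the sample callback below */

/* Sample callback standing in for "suspend, persist, disconnect". */
void demo_prepare(void) { demoPrepared = 1; }

/* Returns 1 if the file at path exists and can be opened, otherwise 0. */
int demo_file_exists(const char *path) {
    FILE *f = fopen(path, "r");
    if (f == NULL) return 0;
    fclose(f);
    return 1;
}

/* One polling pass. If the warning file exists, run the prepare callback
 * and then delete the file to signal that a reboot is now safe.
 * Returns 1 if a warning was handled, 0 if none was pending, -1 on error. */
int demo_check_reboot_warning(const char *warningPath, void (*prepare)(void)) {
    if (warningPath == NULL) return -1;
    if (!demo_file_exists(warningPath)) return 0;
    if (prepare != NULL) prepare();
    return remove(warningPath) == 0 ? 1 : -1;
}
```

The service that initiates the reboot would create the file, wait until it disappears, and only then restart the device.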
There are examples of using the reboot persistence feature in the scmClient example, in the main.c file. At the beginning of the file, the variables for the reboot survival are declared, along with all the other char * variables needed for this client:

char * thingName ,*appKey, *cliAppKey, *envAppKey, *hostname,*whitelistFile,
*stagingDirectory,*logLevelString, *validationSettingsFile,
For purposes of fitting this example on the page, line feeds have been inserted.
The terminate_signal handler is used to respond to an unexpected termination signal (SIGTERM). This response allows a program to persist the job list and disconnect from the platform.

TW_LOG(TW_INFO,"*** Saving job state to disk in response to shutdown request..");
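A common alternative pattern, shown here as an assumption rather than as what main.c does, is to have the signal handler only set a flag and let the main loop persist the job list and disconnect. All names beginning with demo are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler and polled by the main loop. A volatile sig_atomic_t
 * can be written safely from inside a signal handler. */
volatile sig_atomic_t demoShutdownRequested = 0;

/* Hypothetical handler: records the request and returns immediately. */
void demo_terminate_signal(int signum) {
    (void)signum;
    demoShutdownRequested = 1;
}

/* Installs the handler for SIGINT and SIGTERM. Returns 0 on success. */
int demo_install_handlers(void) {
    if (signal(SIGINT, demo_terminate_signal) == SIG_ERR) return -1;
    if (signal(SIGTERM, demo_terminate_signal) == SIG_ERR) return -1;
    return 0;
}

/* Called from the main loop. Returns 1 when it is time to persist the
 * job list and disconnect from the platform. */
int demo_should_shut_down(void) {
    return demoShutdownRequested != 0;
}
```

Keeping the handler this small avoids calling logging or file routines from signal context, at the cost of a short delay until the main loop next checks the flag.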
The settings for reboot survival are stored in environment variables, which are checked after parsing parameters entered at the command line for the client:

/* Look for environment only settings */
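Reading such settings from the environment might look like the following sketch. The names of the environment variables that the example client checks are not shown in this section, so the helpers below take the variable name as an argument; demo_parse_bool() and demo_env_or_default() are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Interprets text as a boolean setting. NULL or unrecognized text yields
 * the fallback value. */
int demo_parse_bool(const char *text, int fallback) {
    if (text == NULL) return fallback;
    if (strcmp(text, "1") == 0 || strcmp(text, "true") == 0) return 1;
    if (strcmp(text, "0") == 0 || strcmp(text, "false") == 0) return 0;
    return fallback;
}

/* Returns the value of the named environment variable, or fallback if the
 * variable is unset or empty. */
const char *demo_env_or_default(const char *name, const char *fallback) {
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}
```

A caller could, for instance, resolve the job list file name with demo_env_or_default(someVariableName, "joblist.json"), so that the documented default applies whenever the variable is not set.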
After setting the log level and verifying that an application key is available, the example starts the SCM extension, passing the reboot survival file name, the path to the file, the enabled setting, and the path and name of the warning file:

/* Begin running the extension with one of the threading engines */
signal(SIGINT , terminate_signal);
signal(SIGTERM , terminate_signal);

For purposes of fitting this example on the page, the startScm line has been broken up into three lines using line feeds.
The Handlers and Settings for Reboot Survival
The twSwPersist.c and twSwPersist.h files in the SCM Edge Extension installation define the handlers and the functions that set parameters for the persistence feature. Default values are preset for the reboot persistence feature; use the functions only if you need to change them. The following table maps each setting that you can change to the function that changes it:
To Change | Use This Function
Whether the reboot persistence feature is enabled | void twSwUpdate_SetRebootSurvivalEnabled(char enabled);
The name of the JSON file in which the job list is stored | void twSwUpdate_SetRebootSurvivalFileName(char *filename);
The path to the JSON file in which the job list is stored | void twSwUpdate_SetRebootSurvivalFilePath(char *path);
The path to and the name of the file for warning the client of an imminent shutdown | void twSwUpdate_SetRebootWarningFileFunction(char * fullPathToFile);
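A call sequence that overrides all four settings might look like the following sketch. The four setter functions are stubbed so that the block compiles without the SDK: their signatures follow the table above, but the stub bodies, the demo variables, and the file names and paths are all hypothetical. The sketch also assumes that the library keeps the pointers it is given rather than copying the strings.

```c
#include <assert.h>
#include <string.h>

/* Stubs so the sketch compiles without the SDK. The bodies, which only
 * record their arguments, are assumptions about the real functions. */
char demoEnabled = 0;
char *demoFileName = NULL;
char *demoFilePath = NULL;
char *demoWarningFile = NULL;

void twSwUpdate_SetRebootSurvivalEnabled(char enabled) { demoEnabled = enabled; }
void twSwUpdate_SetRebootSurvivalFileName(char *filename) { demoFileName = filename; }
void twSwUpdate_SetRebootSurvivalFilePath(char *path) { demoFilePath = path; }
void twSwUpdate_SetRebootWarningFileFunction(char *fullPathToFile) {
    demoWarningFile = fullPathToFile;
}

/* Static storage keeps these strings valid for the life of the process,
 * so nothing has to be deallocated on shutdown. Values are hypothetical. */
char demoNameValue[] = "scm-jobs.json";
char demoPathValue[] = "/var/lib/scm";
char demoWarningValue[] = "/var/run/scm/reboot.warning";

/* Applies the four overrides before the SCM extension is started. */
void demo_configure_reboot_survival(void) {
    twSwUpdate_SetRebootSurvivalEnabled(1);
    twSwUpdate_SetRebootSurvivalFileName(demoNameValue);
    twSwUpdate_SetRebootSurvivalFilePath(demoPathValue);
    twSwUpdate_SetRebootWarningFileFunction(demoWarningValue);
}
```

If the strings were allocated on the heap instead, the caller would need to free them on shutdown.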
The following table lists and briefly describes these functions, their parameters, and return values (if any):
Reboot Survival Handlers and Functions
Handler/Function | Parameters | Description | Possible Return Values (if any)
void persistJobStateToDiskHandler(twSwUpdate* job); | job: This parameter is not used. It is required by the declaration of twSwUpdate_StateChangeHandler. | A handler compatible with twSwUpdateNotification_AddRemoveStateChangeHandler() that, when called, persists all jobs in the software update manager to disk. | None
void restoreStateFromDiskStartupHandler(twSwUpdateManager *swUpdateManager); | swUpdateManager: The software update manager. | A handler compatible with twSwUpdate_AddRemoveStartupHandler() that is expected to be called when the manager starts up. This handler reads the current reboot survival file back into the manager. | None
int twSwUpdate_PersistJob(twSwUpdateJob* job,cJSON ** jobObject); | job: A twSwUpdateJob structure from the Software Manager list, updatesInProgress. jobObject: A pointer to a pointer to a JSON object that will be created by this function. | Creates a new JSON object as jobObject and populates its fields with the values of the job. | Returns TW_OK on success. On failure, returns TW_INVALID_PARAM or TW_ERROR_ALLOCATING_MEMORY.
int twSwUpdate_ReadJob(cJSON * jobObject,twSwUpdateJob** job); | jobObject: A JSON object, most likely read from a file as part of an array. job: A pointer to a pointer to a twSwUpdateJob. This job is allocated and populated. | Converts a JSON object found at jobObject into a twSwUpdateJob. Note that the twSwUpdateJob is allocated and returned in job. | Returns TW_OK on success. On failure, returns TW_INVALID_PARAM or TW_ERROR_ALLOCATING_MEMORY.
int twSwUpdate_PersistJobs(const char* pathToJobFile); | pathToJobFile: The path on disk of the file to be created. | Takes all jobs inside the manager updatesInProgress list and persists them as a JSON file that can be used on restart to restore the service state. The manager should be locked before calling this function. | Returns TW_OK on success. On failure, returns TW_INVALID_PARAM or TW_ERROR_WRITING_FILE.
int twSwUpdate_ReadJobs(const char* pathToJobFile); | pathToJobFile: The path on disk of the file to be read. | Reads and restores the list of SCM jobs inside the job manager from a JSON file. | Returns TW_OK on success. On failure, returns TW_INVALID_PARAM or TW_ERROR_READING_FILE.
void twSwUpdate_SetRebootSurvivalFilePath(char *path); | path: A new path in which to create a reboot survival file. You are responsible for deallocation of this string on shutdown. | Used to override the default reboot survival file path. The default path is the current working directory (application default directory), which is assumed to be writable. | None
void twSwUpdate_SetRebootSurvivalFileName(char *filename); | filename: A new name for the reboot survival file. You are responsible for deallocation of this string on shutdown. | Used to override the default reboot survival file name. The default value is joblist.json. | None
void twSwUpdate_SetRebootSurvivalEnabled(char enabled); | enabled: TRUE to enable these handlers or FALSE to disable them. | Used to enable or disable the persistJobStateToDiskHandler() and the restoreStateFromDiskStartupHandler() simultaneously. These handlers could also be removed from their respective handler lists. | None
void twSwUpdate_SetRebootWarningFileFunction(char * fullPathToFile); | fullPathToFile: The full path, including the file name, of the reboot warning file. You are responsible for deallocation of this string on shutdown. | Used to specify the warning file that can be used to alert the client that a shutdown is about to occur. The client can then prepare for the reboot. | None
void twSwUpdate_PersistJobStateToDiskNow(); | None | Intended for use in panic situations where a shutdown must occur. Persists the job list to disk. | None
void twSwUpdate_DeleteJobListFile(); | None | Intended ONLY for tests that start a manager with no existing job list. | None
Event Listeners
The SCM Edge Extension provides the following event listeners:
twSwUpdate_AddRemoveStartupHandler(): A startup handler that is called when the SCM client starts. This handler is used to insert the handler that loads saved jobs from disk.
twSwUpdateNotification_AddRemoveStateChangeHandler(): Sends an event for every state change. This handler is used to write the job list file to disk when there is a state change.
twSwUpdateManager_RegisterNotification(): A handler for notifications, derived in very broad terms. The more specific twSwUpdateNotification_AddRemoveStateChangeHandler() is used for the reboot persistence feature.
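The listener mechanism can be pictured as a small handler list that is walked on every state change. The sketch below illustrates that pattern only: DemoJobRec, the handler type, and the demo_ functions are hypothetical and do not reproduce the signatures of the SDK functions named above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HANDLERS 4

typedef struct { int state; } DemoJobRec;  /* hypothetical job record */
typedef void (*DemoStateChangeHandler)(DemoJobRec *job);

DemoStateChangeHandler demoHandlers[DEMO_MAX_HANDLERS];
size_t demoHandlerCount = 0;
int demoPersistCalls = 0;

/* Registers a handler. Returns 0 on success, or -1 if the handler is NULL
 * or the list is full. */
int demo_add_state_change_handler(DemoStateChangeHandler handler) {
    if (handler == NULL || demoHandlerCount == DEMO_MAX_HANDLERS) return -1;
    demoHandlers[demoHandlerCount++] = handler;
    return 0;
}

/* Applies a state change and notifies every registered handler. */
void demo_set_state(DemoJobRec *job, int newState) {
    size_t i;
    job->state = newState;
    for (i = 0; i < demoHandlerCount; i++) {
        demoHandlers[i](job);
    }
}

/* Stand-in for a persistence handler: it ignores the job argument, as the
 * documented handler does, and here only counts how often it is invoked. */
void demo_persist_handler(DemoJobRec *job) {
    (void)job;
    demoPersistCalls++;
}
```

Registering demo_persist_handler once and then changing a job's state twice results in two persistence calls, which is the behavior that keeps the file on disk in step with the in-memory list.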
The Job List File
The default name of the job list file is joblist.json. By default, it is stored in the current working directory (the application default directory). The files twSwJob.c and twSwJob.h contain the functions used for processing a job. At run time, the log tells you where the file is written.
The twSwUpdateJob_Create() function creates a new twSwUpdateJob structure. The function takes the following parameters:
Parameters in the Job List File Structure
Parameter | Description
entityName | The name of the entity associated with this update.
json | A cJSON structure that contains the information provided by the platform for the new twSwUpdateJob structure.
vdir | The virtual path to the directory to which the update file(s) would be downloaded.
downloadFunc | The function to execute for a download. Can be NULL if the default server file push is used.
installFunc | The function to execute to perform the installation. Must not be NULL.
This function returns a pointer to the newly allocated twSwUpdateJob structure.
The calling function retains ownership of the \p pointer
The calling function gains ownership of the returned twSwUpdateJob structure and is responsible for freeing it with twSwUpdateJob_Delete().
The full signature of this function follows:

twSwUpdateJob * twSwUpdateJob_Create(const char * entityName, cJSON * json,
char * vdir, swupdate_func downloadFunc, swupdate_func installFunc);
For publishing purposes, a linefeed and spaces have been added to the signature above. Refer to twSwJob.h in the c subdirectory of the SCM Edge Extension installation.
Here is an example of a joblist.json file:

"id": "1739a009-c1b1-49c4-a5c5-d37b89db89cd",
"entityName": "SampleScmDevice",
"campaignName": "WindowPackage",
"serverUpdateMgr": "TW.RSM.SFW.SoftwareManager",
"downloadTime": 1548350710490,
"installTime": 1548268767811,
"script": "script.bat",
"path": "/",
"state": 5,
"state_string": "downloading",
"prev_state": 5,
"lastActivity": 1548350823914,
"downloadDir": "C:\\Users\\me\\staging\\SampleScmDevice",
"scriptParams": " twId=1739a009-c1b1-49c4-a5c5-d37b89db89cd
twCampaign=WindowPackage twEntity=SampleScmDevice
For purposes of fitting this example on the page, line feeds have been inserted to break up the value of the scriptParams property. In an actual JSON file, the value would appear on one continuous line.