Concurrency In Core Data
Generally, in iOS the two frameworks that are not thread-safe are UIKit and Core Data. Exceptions to both do exist, though.
In UIKit, not every class is thread-unsafe: UIImage is thread-safe and can be used successfully in a background queue. Views, however, cannot be used in a background queue. UILabel and UITableView are two examples; using them in the background results in sporadic and unpredictable app crashes.
Core Data In A Background Queue
Core Data also has exceptions that make it possible, and really useful, to use in a background queue.
A few tasks that benefit include:
- Importing large sets of objects into a database without blocking the user interface.
- Downloading objects from a REST API service and inserting them into a database.
- Saving datasets in the background, as saving can take a user-perceivable amount of time if done in the foreground.
It actually makes a lot of sense to use Core Data concurrently and there are two ways to achieve it:
- The legacy way with manual communication between contexts
- The modern way with parent-child contexts
Legacy: Manual Communication
This approach requires one separate managed object context per thread or queue. We then need to manually send the changes from one context to another. For example, if a background context is receiving data from a web server and a front-end context is displaying it to a user, we must notify the front-end context when the new objects have been downloaded. This is relatively error-prone, and there's a more robust way to do it.
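The manual hand-off described above can be sketched with Core Data's save notification. This is only a minimal sketch, assuming both contexts were created with a queue-based concurrency type so that perform is available; the function name forwardSaves and its parameter names are hypothetical.

```swift
import CoreData

/// Hypothetical helper: whenever `source` saves, merge its changes into `target`.
/// Returns the observer token so the caller can remove the observer later.
func forwardSaves(from source: NSManagedObjectContext,
                  to target: NSManagedObjectContext) -> NSObjectProtocol {
    return NotificationCenter.default.addObserver(
        forName: .NSManagedObjectContextDidSave,
        object: source,   // only listen to saves of this one context
        queue: nil        // deliver on whichever thread posted the notification
    ) { notification in
        // Hop onto the target context's own queue before touching it.
        target.perform {
            target.mergeChanges(fromContextDidSave: notification)
        }
    }
}
```

The caller is responsible for keeping the returned token and passing it to NotificationCenter.default.removeObserver(_:) when the contexts go away, which is part of what makes this approach easy to get wrong.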
Modern: Parent-Child Communication
To make use of concurrent Core Data we need to understand a few things:
- What does it mean that Core Data is not thread-safe?
- When new objects are created in a background thread, how does the main queue find out about them so they can be displayed in the UI?
Core Data and Thread Safety
Saying Core Data is not thread safe is only half true.
Certain classes, most notably NSManagedObject and NSManagedObjectContext, can only be used in the same queue that they were created in.
Generally speaking, if you create a context in the main queue, you can only use it in the main queue. If you use it in a different queue than the one it was created in, low-level concurrency bugs will be the result.
Contexts & Queues
To create an instance of NSManagedObjectContext we could do something like this:
DispatchQueue.global(qos: .utility).async {
    // Confinement type (deprecated since iOS 9): the context may only be used on the queue that created it
    let context = NSManagedObjectContext(concurrencyType: .confinementConcurrencyType)
}
That's one way, but there's a better one. In the above example, a queue accesses the NSManagedObjectContext, and hopefully it's the same queue that created it. That approach is unsafe, as the programmer is responsible for making sure that the queue that accesses the context is the same one that created it.
To make this safer and take that responsibility off the programmer, Apple turned the design inside out.
Now it's the NSManagedObjectContext that owns a queue and makes sure that all its code runs in that queue. This design pattern, in which an object owns a queue and makes sure that all access to its internal state runs in that queue, even has a name: it's called an Actor.
So, to create an NSManagedObjectContext with its own queue, and to run code that uses the context in the correct queue, we have to pass the appropriate constant when calling init on NSManagedObjectContext:
- NSMainQueueConcurrencyType: this context will run in the main queue.
- NSPrivateQueueConcurrencyType: this context will run in a queue that it creates by itself. This will be a background context.
// Will run in the main queue
let mainContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
// Will run in the background
let backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
To run all the code that uses the context in the right queue we will need to use the correct methods.
The context has two similar methods that take a closure: perform (performBlock in Objective-C, similar to async) and performAndWait (performBlockAndWait in Objective-C, similar to sync). The context makes sure that the closure runs in the right queue.
func backgroundLoad() {
    if let coord = self.context.persistentStoreCoordinator {
        // Background context that shares the main context's coordinator
        let bckgContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
        bckgContext.persistentStoreCoordinator = coord
        // Everything that touches bckgContext runs inside perform, on its private queue
        bckgContext.perform {
            for i in 1..<100 {
                let nb = Notebook(name: "New notebook \(i)", context: bckgContext)
                for j in 1..<100 {
                    let note = Note(text: "Type something \(j)", context: bckgContext)
                    note.notebook = nb
                }
            }
        }
    }
}
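For contrast with the asynchronous perform call used above, here is a small sketch of both variants side by side. The helper names hasPendingChanges and discardChangesLater are hypothetical and exist only for illustration.

```swift
import CoreData

/// Hypothetical helper: reads hasChanges on the context's own queue.
/// performAndWait blocks the caller until the closure has finished,
/// so the result can be handed back through a captured variable.
func hasPendingChanges(in context: NSManagedObjectContext) -> Bool {
    var result = false
    context.performAndWait {
        result = context.hasChanges
    }
    return result
}

/// Hypothetical helper: schedules a rollback on the context's queue.
/// perform returns immediately; the closure runs later on that queue.
func discardChangesLater(in context: NSManagedObjectContext) {
    context.perform {
        context.rollback()
    }
}
```

The blocking variant is convenient when the caller needs an answer right away, but calling it from the main queue stalls the UI for as long as the closure runs.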
Only one thing remains: we need to find a way to send information from one context to the other. In the previous example, once the bckgContext is done loading Notebooks, we have to notify the front-end context to reload these newly created notebooks into the UI.
Communication Between Contexts
Background Writer & Foreground Reader
Suppose we have to write an app that displays stock data. It will receive stock prices via a web service with a JSON representation of a stock and its price. The data will be displayed in a table. The app must also work in offline mode, with the data already downloaded.
Since we need to save the stock data locally, we will use Core Data and we'll need two contexts.
- A context that reads the database and displays the contents in a table, which must run in the main queue.
- A context that retrieves the data from the server, parses the JSON, converts it into NSManagedObject instances and writes them into the database, which must run in a private background queue.
The main queue is the reader and the background queue is the writer. For the background queue to notify the main queue that there are new objects available, we'll use parent-child contexts.
Parent-Child Contexts
We will use a feature of NSManagedObjectContext: the parent property, which lets one context act as the child of another.
Usually, when a context saves, it pushes its objects to the persistent store coordinator, which in turn saves them into the database. However, if a context has a parent context, then it has no coordinator. When it saves, the objects are pushed into its parent context.
This is the information flow from one context to another, from child to parent when the child saves.
Once the objects from the child context are saved into the parent, the parent sends a notification that NSFetchedResultsController uses to refresh the table view.
So, to create the stock app, we would need:
- A parent context running in the main queue responsible for displaying objects in the UI and for saving into the database.
- A child context running in a private queue responsible for receiving new objects from the web server and pushing them into the parent context.
Given these requirements, we should modify our Core Data Stack class so that it has a private backgroundContext that is a child of the main context. Additionally, we need to add a method that lets us submit a "batch" closure that runs in the background and, once it's done, saves from the background context to the main one. This way, as soon as the background batch operation is done, all the new objects are available in the main context.
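// Within the init: create a background context that is a child of the main context
// (assumes the main context property is named `context`, as in backgroundLoad)
backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
backgroundContext.parent = context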
// MARK: Batch processing in the background
extension CoreDataStack {
    typealias Batch = (_ workerContext: NSManagedObjectContext) -> ()
    func performBackgroundBatchOperation(_ batch: @escaping Batch) {
        backgroundContext.perform {
            batch(self.backgroundContext)
            // Save it to the parent context, so normal saving
            // can work
            do {
                try self.backgroundContext.save()
            } catch {
                fatalError("Error while saving backgroundContext: \(error)")
            }
        }
    }
}
Saving In A Background Queue
It's important to note that saving is an operation that can quickly become a bottleneck. If our model is complex and we have to save frequently, it could start blocking the interface. Saving in the background is one way to mitigate this bottleneck, and to do this we only need to change the architecture of our Core Data Stack a little bit.
Instead of having two contexts (a main context in the main queue and a background context in a private queue), we will have three.
- A persisting context in a private queue. This one saves to the coordinator, and since it's in a private queue, it saves in the background. Saving is its only responsibility.
- A main context, which is a child of the persisting one, running in the main queue. This context is responsible for displaying objects in the UI.
- A background context, which is a child of the main context, running in a private queue. This context will be used for background batch jobs.
The only difference is that there's now an extra context between the main context and the coordinator.
So with a few minor modifications to our Core Data Stack we can autosave in a background thread and run batch operations in a private queue.
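The three-context arrangement can be sketched as follows. This is a minimal sketch under stated assumptions, not the original Core Data Stack class: the type name ThreeContextStack, its property names, and the save method are hypothetical, and the coordinator is assumed to be configured elsewhere.

```swift
import CoreData

// Hypothetical stack illustrating the three-context layout.
final class ThreeContextStack {
    let persistingContext: NSManagedObjectContext  // private queue, writes to the coordinator
    let mainContext: NSManagedObjectContext        // main queue, feeds the UI
    let backgroundContext: NSManagedObjectContext  // private queue, batch jobs

    init(coordinator: NSPersistentStoreCoordinator) {
        persistingContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
        persistingContext.persistentStoreCoordinator = coordinator

        mainContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
        mainContext.parent = persistingContext

        backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
        backgroundContext.parent = mainContext
    }

    /// Pushes main-context changes up to the persisting context, then lets
    /// the persisting context write to the store on its private queue.
    func save() {
        mainContext.perform {
            do {
                if self.mainContext.hasChanges { try self.mainContext.save() }
            } catch {
                print("Error while saving main context: \(error)")
                return
            }
            self.persistingContext.perform {
                do {
                    if self.persistingContext.hasChanges { try self.persistingContext.save() }
                } catch {
                    print("Error while saving persisting context: \(error)")
                }
            }
        }
    }
}
```

Saving the main context only moves objects in memory to its parent, so it stays cheap; the expensive write to the store happens inside the persisting context's private queue.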