Sentiment Analysis on iOS Using SwiftUI, Natural Language, and Combine: Hacker News Top Stories
Leveraging Apple’s reactive programming framework to handle asynchronous tasks while doing natural language processing in real time
Giving applications the ability to understand natural-language text always amazes me. Apple made significant strides with its Natural Language framework in 2019. In particular, the introduction of a built-in sentiment analysis feature makes it easier to build smarter NLP-based iOS applications.
Besides the improvements to the Natural Language framework, SwiftUI and Combine were the two biggies that were introduced during WWDC 2019.
SwiftUI is a declarative framework written in Swift that helps developers build user interfaces quickly. Combine, on the other hand, is Apple’s own reactive programming framework, designed to power modern application development, especially when handling asynchronous tasks.
Our Goal
We’ll be using the Hacker News API to fetch the top stories using a Combine-powered URLSession.
Subsequently, we’ll run Natural Language’s built-in sentiment analysis over the top-level comments of each story to get an idea of the general reaction.
Over the course of the tutorial, we’ll see how reactive programming makes it easier to chain multiple network requests and transform and pass the results to the Subscriber.
Pre-requisite: Having a brief idea about the Combine framework would be helpful. Here’s a piece to kickstart using Combine in Swift.
Getting Started
To start, let’s create a new Xcode SwiftUI project. We’ll be using the official Hacker News API, which offers almost real-time data.
To create a SwiftUI List that holds the top stories from Hacker News, we need to set up our ObservableObject class. This class is responsible for fetching the stories from the API and passing them on to the SwiftUI List. The following code does that for you:
import Combine
import Foundation

class HNStoriesFeed: ObservableObject {
    @Published var storyItems = [StoryItem]()
    var urlBase = "https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty"
    var cancellable: Set<AnyCancellable> = Set()

    // As soon as the ids arrive, fetch the first 10 stories.
    private var topStoryIds = [Int]() {
        didSet {
            fetchStoryById(ids: topStoryIds.prefix(10))
        }
    }

    init() {
        fetchTopStories()
    }

    // Runs one FetchItem publisher per id and delivers all stories as a single array.
    func fetchStoryById<S>(ids: S) where S: Sequence, S.Element == Int {
        Publishers.MergeMany(ids.map { FetchItem(id: $0) })
            .collect()
            .receive(on: DispatchQueue.main)
            .sink(receiveCompletion: {
                if case let .failure(error) = $0 {
                    print(error)
                }
            }, receiveValue: {
                self.storyItems = self.storyItems + $0
            })
            .store(in: &cancellable)
    }

    // Fetches the ids of the current top stories.
    func fetchTopStories() {
        URLSession.shared.dataTaskPublisher(for: URL(string: urlBase)!)
            .map { $0.data }
            .decode(type: [Int].self, decoder: JSONDecoder())
            .sink(receiveCompletion: { completion in
                switch completion {
                case .failure(let error):
                    print("Something went wrong: \(error)")
                case .finished:
                    print("Received Completion")
                }
            }, receiveValue: { value in
                self.topStoryIds = value
            })
            .store(in: &cancellable)
    }
}
There’s a lot happening in the above code. Let’s break it down:
fetchTopStories is responsible for fetching an array of integer ids for the top stories.
To save time, we’re passing only the top 10 story identifiers to the fetchStoryById function, where we fetch the Hacker News stories using a custom publisher, FetchItem, and merge the results.
The collect() operator of Combine is responsible for gathering all the stories fetched from the API into a single array.
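To see what MergeMany and collect() do in isolation, here is a minimal sketch that swaps the network publishers for Just publishers so that it runs without a connection. The names publishers and batches are illustrative and not part of the app.

```swift
import Combine

// Three stand-in publishers; in the app these would be FetchItem instances.
let publishers = [1, 2, 3].map { Just($0 * 10) }

var batches = [[Int]]()

// MergeMany funnels the values of all publishers into one stream;
// collect() buffers them and emits a single array once every publisher has finished.
let cancellable = Publishers.MergeMany(publishers)
    .collect()
    .sink { batches.append($0) }

// One array is delivered, not three separate values.
print(batches.count, batches.first?.sorted() ?? [])
```

With real network requests, values are merged in the order the responses arrive, so the collected array isn’t guaranteed to follow the order of the ids.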
Let’s look at how to construct our custom Combine publisher next.
Creating a Custom Combine Publisher
To create a custom publisher, we need to conform the struct to the Publisher protocol and set the Output and Failure types of the stream as shown below:
struct FetchItem: Publisher {
    typealias Output = StoryItem
    typealias Failure = Error

    let id: Int

    func receive<S>(subscriber: S) where S: Subscriber, Failure == S.Failure, Output == S.Input {
        let request = URLRequest(url: URL(string: "https://hacker-news.firebaseio.com/v0/item/\(id).json")!)
        // Fetch the item, decode it, and hand the resulting stream to the subscriber.
        URLSession.DataTaskPublisher(request: request, session: URLSession.shared)
            .map { $0.data }
            .decode(type: StoryItem.self, decoder: JSONDecoder())
            .receive(subscriber: subscriber)
    }
}
The id property represents the story identifier that’s passed in the initializer.
Implementing the receive(subscriber:) method is crucial. It connects the publisher to the subscriber, and we need to ensure that the output from the publisher has the same type as the input to the subscriber.
Inside the receive<S>(subscriber: S) method, we’re making another API request. This time, we’re fetching the story and decoding it using a StoryItem model, which is defined below:
struct StoryItem: Identifiable, Codable {
    let by: String
    let id: Int
    let kids: [Int]?
    let title: String?

    private enum CodingKeys: String, CodingKey {
        case by, id, kids, title
    }
}
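As a quick check of how this model maps onto an item response, the sketch below decodes a hand-written JSON payload. The helper decodeStory and the sample values are made up for illustration; only the field names come from the model above.

```swift
import Foundation

struct StoryItem: Identifiable, Codable {
    let by: String
    let id: Int
    let kids: [Int]?
    let title: String?
}

// Hypothetical helper that decodes a single story from a JSON string.
func decodeStory(_ json: String) throws -> StoryItem {
    try JSONDecoder().decode(StoryItem.self, from: Data(json.utf8))
}

// Invented payload shaped like an item response; "score" is not part of the model.
let sampleJSON = """
{"by": "someuser", "id": 1, "kids": [2, 3], "title": "A sample story", "score": 42}
"""

do {
    let story = try decodeStory(sampleJSON)
    print(story.title ?? "", story.kids ?? [])
} catch {
    print("Decoding failed: \(error)")
}
```

Keys that the model doesn’t declare are ignored by JSONDecoder, and optional properties decode to nil when their key is missing, which is why kids and title are declared as optionals.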
The array of StoryItems is then published to the SwiftUI view, which has a built-in subscriber. The following code is responsible for displaying the Hacker News stories in the SwiftUI list:
import SwiftUI

struct ContentView: View {
    @ObservedObject var hnFeed = HNStoriesFeed()

    var body: some View {
        NavigationView {
            List(hnFeed.storyItems) { articleItem in
                // The destination shows the top-level comments of the selected story.
                NavigationLink(destination: LazyView(CommentView(commentIds: articleItem.kids ?? []))) {
                    StoryListItemView(article: articleItem)
                }
            }
            .navigationBarTitle("Hacker News Stories")
        }
    }
}

struct StoryListItemView: View {
    var article: StoryItem

    var body: some View {
        VStack(alignment: .leading) {
            Text("\(article.title ?? "")")
                .font(.headline)
            Text("Author: \(article.by)")
                .font(.subheadline)
        }
    }
}
The NavigationLink is responsible for taking the user to the destination screen, where the comments are displayed. We’ve wrapped our destination view, CommentView, in a lazy view so that it’s only created once the user navigates to it. Without the wrapper, every destination view is initialized as soon as its row is built, which here would start fetching comments for every story in the list. It’s a common pitfall with NavigationLinks.
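LazyView isn’t part of SwiftUI, and its definition isn’t shown in this piece. A commonly used version looks roughly like the sketch below; treat it as an assumption about what the project uses rather than the exact code.

```swift
import SwiftUI

// Hypothetical helper: defers building the wrapped view until SwiftUI asks for `body`.
struct LazyView<Content: View>: View {
    private let build: () -> Content

    // @autoclosure lets callers write LazyView(CommentView(...)) without the
    // CommentView initializer running at that point.
    init(_ build: @autoclosure @escaping () -> Content) {
        self.build = build
    }

    var body: Content {
        build()
    }
}
```

Because the initializer only stores a closure, the wrapped view’s init (and any network request it starts) runs when the body is evaluated, not when the list row is created.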
Before we jump into the comments section and the subsequent sentiment analysis using NLP, let’s recap what we’ve built so far: a list of the top Hacker News stories, each row showing the story’s title and author.
Fetching Hacker News Comments and Analyzing Sentiment Scores
The kids property in the StoryItem model contains the ids for the top-level comments. We’ll use a similar approach for multiple network requests as we did earlier, using Combine publishers.
The difference here is the inclusion of Natural Language’s built-in sentiment analysis to give a sentiment score to each comment, followed by calculating the mean sentiment score for that story.
The following code is from the HNCommentFeed class, which conforms to ObservableObject:
import Combine
import NaturalLanguage

class HNCommentFeed: ObservableObject {
    let nlTagger = NLTagger(tagSchemes: [.sentimentScore])
    let didChange = PassthroughSubject<Void, Never>()
    var cancellable: Set<AnyCancellable> = Set()
    @Published var sentimentAvg: String = ""

    var comments = [CommentItem]() {
        didSet {
            // Skip the average when there are no comments; 0 / 0 would display "nan".
            guard !comments.isEmpty else { return }
            var sumSentiments: Float = 0.0
            for item in comments {
                let floatValue = (item.sentimentScore as NSString).floatValue
                sumSentiments += floatValue
            }
            let ave = sumSentiments / Float(comments.count)
            sentimentAvg = String(format: "%.2f", ave)
            didChange.send()
        }
    }

    // Fetch the first 10 top-level comments as soon as the ids are set.
    private var commentIds = [Int]() {
        didSet {
            fetchComments(ids: commentIds.prefix(10))
        }
    }

    func fetchComments<S>(ids: S) where S: Sequence, S.Element == Int {
        Publishers.MergeMany(ids.map { FetchComment(id: $0, nlTagger: nlTagger) })
            .collect()
            .receive(on: DispatchQueue.main)
            .sink(receiveCompletion: {
                if case let .failure(error) = $0 {
                    print(error)
                }
            }, receiveValue: {
                self.comments = self.comments + $0
            })
            .store(in: &cancellable)
    }

    func getIds(ids: [Int]) {
        self.commentIds = ids
    }
}
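The comments property isn’t marked @Published. Once the comments are fetched from the API using the custom publisher, its didSet calculates the mean sentiment score and sets it on the sentimentAvg property, which is marked with the @Published property wrapper. That assignment is what prompts SwiftUI to refresh the view. The code also invokes didChange.send(), but note that an ObservableObject notifies its views through objectWillChange, so the custom didChange subject only matters if something subscribes to it explicitly.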
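The CommentItem model isn’t listed in this piece. Based on how it’s used (decoded from the item endpoint, created with id, text, and sentimentScore, and read back as a string score), a minimal sketch could look like the following. The exact shape is an assumption; in particular, leaving sentimentScore out of CodingKeys is just one way to keep decoding working, since the API doesn’t return that field.

```swift
import Foundation

// Assumed model: `sentimentScore` is filled in locally, not by the API.
struct CommentItem: Identifiable, Codable {
    let id: Int
    let text: String?
    var sentimentScore: String = ""

    // Only these keys are read from (and written to) JSON.
    private enum CodingKeys: String, CodingKey {
        case id, text
    }
}

// Decoding leaves the score at its default; the memberwise initializer sets it explicitly.
// The JSON values here are invented for illustration.
let decoded = try? JSONDecoder().decode(CommentItem.self, from: Data(#"{"id": 7, "text": "Nice"}"#.utf8))
let scored = CommentItem(id: 7, text: "Nice", sentimentScore: "0.6")
print(decoded?.sentimentScore ?? "decoding failed", scored.sentimentScore)
```

Storing the score as a String matches how the feed and the view read it (via NSString’s floatValue); a numeric property would be an equally reasonable choice.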
Before we look at the SwiftUI view that holds the comments with their respective scores, let’s look at the custom Combine publisher FetchComment, as shown below:
import Combine
import NaturalLanguage
import UIKit // provides the HTML importer for NSAttributedString on iOS

struct FetchComment: Publisher {
    typealias Output = CommentItem
    typealias Failure = Error

    //1
    let id: Int
    let nlTagger: NLTagger

    func receive<S>(subscriber: S) where S: Subscriber, Failure == S.Failure, Output == S.Input {
        let request = URLRequest(url: URL(string: "https://hacker-news.firebaseio.com/v0/item/\(id).json")!)
        URLSession.DataTaskPublisher(request: request, session: URLSession.shared)
            .map { $0.data }
            .decode(type: CommentItem.self, decoder: JSONDecoder())
            .map { commentItem -> CommentItem in
                //2
                // Note: Apple's documentation advises against calling the HTML importer
                // from a background thread, so treat this step with care.
                let data = Data(commentItem.text?.utf8 ?? "".utf8)
                var commentString = commentItem.text
                if let attributedString = try? NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html], documentAttributes: nil) {
                    commentString = attributedString.string
                }
                //3
                self.nlTagger.string = commentString
                var sentimentScore = ""
                if let string = self.nlTagger.string {
                    //4
                    let (sentiment, _) = self.nlTagger.tag(at: string.startIndex, unit: .paragraph, scheme: .sentimentScore)
                    sentimentScore = sentiment?.rawValue ?? ""
                }
                //5
                let result = CommentItem(id: commentItem.id, text: commentString, sentimentScore: sentimentScore)
                return result
            }
            .print() // debug logging of publisher events
            .receive(subscriber: subscriber)
    }
}
Much like the previous custom publisher, we need to define the Output and Failure types. Besides that, we’re doing quite a number of things in the map operator to transform the CommentItem into a new instance, which holds the sentiment score as well.
Let’s look at the important steps, which are marked with numbered comments.
1. We pass in the id of the comment and the nlTagger instance from HNCommentFeed. The nlTagger is responsible for segmenting the text into sentence or paragraph units and processing the information in each part. In our case, we’ve set it to process the sentimentScore, which is a floating-point value between -1.0 and 1.0 based on how negative or positive the text is.
2. The comment’s text returned from the API request in the CommentItem instance is an HTML string. We convert it to Data (using utf8) and let NSAttributedString parse it, which gives us a plain string devoid of HTML tags and escape characters.
3. Next, we set the cleaned-up string on the nlTagger’s string property. This string is analyzed by the tagger.
4. We ask the tagger for the sentimentScore tag of the paragraph that starts at the beginning of the string. The tag’s raw value is the score, expressed as a string.
5. Finally, we create a new CommentItem instance that holds the sentimentScore. This result is passed downstream to the subscriber.
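The tagging steps can also be tried on their own, outside the publisher. In the sketch below, the helper name sentimentScore(for:) is hypothetical; the NLTagger calls are the same ones used in the publisher. It creates a tagger per call, which sidesteps sharing one tagger between concurrent requests at the cost of some setup work.

```swift
import NaturalLanguage

// Hypothetical helper: returns the paragraph-level sentiment score, or nil if none is produced.
func sentimentScore(for text: String) -> Double? {
    guard !text.isEmpty else { return nil }
    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    tagger.string = text
    // Ask for the sentiment tag of the paragraph that starts at the first character.
    let (tag, _) = tagger.tag(at: text.startIndex, unit: .paragraph, scheme: .sentimentScore)
    // The tag's raw value is the score rendered as a string.
    return tag.flatMap { Double($0.rawValue) }
}

if let score = sentimentScore(for: "I really love how simple this API is.") {
    print(score >= 0 ? "positive" : "negative", score)
}
```

Returning an optional Double keeps the “no score available” case distinct from a neutral score of 0, something the string-based version can’t express.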
The code for the CommentView SwiftUI struct, which holds the comments along with their scores, is given below:
struct CommentView: View {
    @ObservedObject var commentFeed = HNCommentFeed()

    var body: some View {
        List(commentFeed.comments) { item in
            Text(item.sentimentScore)
                .background(((item.sentimentScore as NSString).floatValue >= 0.0) ? Color.green : Color.red)
                .frame(alignment: .trailing)
            Text(item.text ?? "")
        }
        .navigationBarTitle("Comment Score \(commentFeed.sentimentAvg)")
        .navigationBarItems(trailing: (((commentFeed.sentimentAvg as NSString).floatValue >= 0.0) ? Image(systemName: "smiley.fill").foregroundColor(Color.green) : Image(systemName: "smiley.fill").foregroundColor(Color.red)))
    }

    init(commentIds: [Int]) {
        commentFeed.getIds(ids: commentIds)
    }
}
We’ve set an SF Symbol (new in iOS 13) as the Navigation Bar Button, the color of which represents the overall sentiments of the top-level comments of that story.
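As a result, each comment row shows its sentiment score on a green or red background next to the comment text, and the navigation bar title shows the mean score for the story.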
Conclusion
Using Apple’s built-in sentiment score for NLP, we see that most top stories attract polarizing opinions on Hacker News. While a lot of comments are cryptic, which can cause accuracy issues in sentiment analysis even for custom models, Apple’s built-in sentiment analysis does a fine job. The Natural Language framework has shown some good progress, and there’s a lot more to look forward to at WWDC 2020.
Let’s take a step back and look at what we’ve learned in this piece.
We saw:
How the Combine framework makes it really easy to handle multiple network requests with URLSession. We managed to chain requests, set up dependent API requests, and synchronize the API results while avoiding the dreaded callback hell.
How to create custom publishers and ensure that the contract between the publisher and subscriber is maintained (see the where clause in the receive methods).
How to use Combine operators to our advantage. We managed to transform a bunch of comments to add an additional property, the sentiment score, by performing natural language processing inside the Combine operators.
Moving forward, you can extend the above implementation by adding endless scrolling, which would let you browse all the top Hacker News stories rather than just the first 10. Here’s a good reference for implementing endless scrolling in a SwiftUI-based application.
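As a starting point for that extension, the paging logic itself can be kept separate from the UI. The helper below is a hypothetical sketch, not part of the project: given all the story ids and the number of stories already loaded, it returns the next batch of ids to fetch.

```swift
// Hypothetical helper: the next `pageSize` ids after the first `loadedCount` ones.
func nextPage(of ids: [Int], after loadedCount: Int, pageSize: Int = 10) -> ArraySlice<Int> {
    // dropFirst and prefix both clamp to the available elements,
    // so the last page may be shorter than pageSize or empty.
    ids.dropFirst(max(0, loadedCount)).prefix(pageSize)
}

let allIds = Array(1...25)
print(Array(nextPage(of: allIds, after: 0)))   // the first 10 ids
print(Array(nextPage(of: allIds, after: 20)))  // the remaining 5 ids
```

Since fetchStoryById accepts any sequence of Int, one option would be to call it with nextPage(of: topStoryIds, after: storyItems.count) when the last row of the list appears.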
The full source code of the above application is available in this GitHub repository.
That’s a wrap for this one. Thanks for reading, and I hope you enjoyed the mix of Combine and the Natural Language framework.