Systems Interview at a HFT

Eshaan Aggarwal @EshaanAgg
I had the opportunity to interview at a prominent HFT firm in , and I was asked a variety of questions that tested my understanding of C++ and systems programming. I decided to compile all the questions that I was asked during the interview, not only from my interview, but also from the experiences of my peers who interviewed at the same firm, and online sources like Glassdoor and LeetCode. In addition to these questions, I also was asked some primitives or common OS problems from my other blog, so do brush up on those as well.
Table of Contents
Question 1
Give the output of the following program.
#include<bits/stdc++.h>using namespace std;
void fun(int *p, int *q) { int *temp = p; q = p; p = temp; *temp = 2;}
int main() { int r = 20, s = 30; int *p = &r, *q = &s; fun(p, q);
cout << *p << " " << *q << "\n";}
Answer
2 30
Assuming memory addresses in the program and dry running the same with the interviewer might be a good idea to clarify any doubts, and to convey your thought process. It is also particularly helpful in debugging such tricky questions.
The key concept to keep in mind is that pointer values are passed by value in C++. Changes to pointer variables inside a function do not affect the original pointers, but changes to what they point to do affect the original data.
In the fun
function:
- The code swaps the pointers
p
andq
, but since they are passed by value, the original pointers inmain
remain unchanged. - The line
*temp = 2;
modifies the value pointed to bytemp
, which is the same asp
inmain
, changing the value ofr
to2
.
Question 2
What are the issues associated with the following code?
#include <bits/stdc++.h>using namespace std;
const int N = 1000000;
class A { char *str;
public: A() { str = new char[N]; }
A(int n) { str = new char[n]; }
~A() { delete[] str; }};
class B { A a;
public: B(int n) { a = A(n); }}
int main() { B b(10); return 0;};
Answer
Bugs due to implicit copy constructors (or default constructors), memory leaks and double-frees are common concepts tested in such questions. If you find nothing “wrong” in the example, it might be worthwhile to explicitly think about these issues.
The above code contains two errors: a double free and a memory leak.
1. Double Free
class B { A a;
public: B(int n) { a = A(n); // <-- problematic }};
Here in the body of the constructor, first the parametrized constructor of A
is called with n
(due to the call A(n)
), and the object is created. Next, when this object is assigned to the memeber a
, the copy assignment operator (which was implicitly generated) is called. This assignment uses a shallow copy (as default behavior), so both a
and the temporary A(n)
now point to the same str
. When the temporary A(n)
is destroyed at the end of the expression, it’s destructor frees str
. Then, when a
is destroyed in main()
, it again tries to free the same str
, leading to a double free — leading to undefined behavior.
2. Memory Leak
class B { A a;};
When B
is constructed, the member a
is first default-constructed via A()
— which allocates str = new char[N]
. Immediately afterward, a = A(n)
assigns a new allocation to a.str
without freeing the old one, and thus the original allocation (new char[N]
) is now orphaned — there is no pointer pointing to it.
Question 3
What is the output of the following code?
#include<bits/stdc++.h>using namespace std;
class A {public: A() { cout << "A's constructor\n"; }
~A() { cout << "A's destructor\n"; throw runtime_error("Error in A's destructor"); }};
int main() { try { A a; } catch (const runtime_error &e) { cout << "Caught exception: " << e.what() << "\n"; } return 0;}
Answer
Destuctors cannot throw exceptions in C++. Constructors can. If a constructor throws an exception, the destructor will not be called for that object.
Runtime error occurs after printing:A's constructorA's destructor
Destructors in C++ are not allowed to throw exceptions. If an exception is thrown from a destructor during stack unwinding (i.e., when an exception is already being handled), it leads to std::terminate
being called, which results in program termination with a runtime error that cannot be caught. Destructor’s default to noexcept
behavior.
Question 4
What is the output of the following code?
#include<bits/stdc++.h>using namespace std;
class A {public: void fun() { cout << "A"; }};
class B : public A {public: void fun() { cout << "B"; }};
class C : public B {public: virtual void fun() { cout << "C"; }};
class D : public C {public: void fun() { cout << "D"; }};
int main() { D d;
A &a = d; a.fun();
B &b = d; b.fun();
C &c = d; c.fun();
d.fun();
return 0;}
Answer
Using references to upcast is the same concept as using pointers to upcast, and the whole process of creation of the virtual table and
vptr
is the same as well. The only difference is that references are not null, and they cannot be reassigned.
ABDD
In this code, we have a class hierarchy with virtual functions. Since A & B are not virtual, they will call the respective functions in their own class. However, since C is virtual, it will call the overridden function in D.
Question 5
Give the output of the following code:
#include <stdio.h>
char *c[] = {"Systems", "Placements", "IIT", "JEE(Advanced)"};char **cp[] = {c + 3, c + 2, c + 1, c};char ***cpp = cp;
int main() { printf("%s ", *(*++cpp - 1)); printf("%s ", *(*++cpp)); printf("%s ", *(*++cpp + 1));
return 0;}
Answer
Wherever we create an array, the array name is a pointer to the first element of the array!
Placements Placements Placements
The code involves multiple levels of pointers and pointer arithmetic. Let’s break it down:
c
is an array of strings (character pointers).cp
is an array of pointers to pointers, where each element points to a specific string inc
.cpp
is a pointer to the first element ofcp
.
In the main
function:
*++cpp
incrementscpp
to point to the second element ofcp
, which isc + 2
(pointing to “IIT”) and returns the pointerc + 2
. The- 1
operation moves it back toc + 1
, and when dereferenced, it gives the character pointer to “Placements”.*++cpp
incrementscpp
again to point to the third element ofcp
, which isc + 1
. When redereferenced twice, it gives the character pointer to “Placements”.*++cpp + 1
incrementscpp
to point to the fourth element ofcp
, which isc
. Adding1
moves it to the next string, which is “Placements”.
Question 6
What is the output of the following code?
#include <bits/stdc++.h>using namespace std;
int main() { char *str = "Welcome to C++"; double *p = (double *)str; p++; cout << (char *)p << "\n"; return 0;}
Answer
to C++
This code demonstrates pointer arithmetic and type casting in C++. Here’s a breakdown of what happens:
char *str = "Welcome to C++";
initializes a pointerstr
to a string literal. The pinterstr
points to the first character of the string “Welcome to C++”.double *p = (double *)str;
casts thechar *
pointer to adouble *
. This is a dangerous operation because it treats the memory address of the string as if it were pointing to adouble
, which typically has a size of 8 bytes on most systems.p++;
increments the pointerp
by the size of adouble
, which is usually 8 bytes. This meansp
now points to the memory address that is 8 bytes ahead of the original string’s starting address, and thus it points to the character ‘t’ in the string “Welcome to C++”.cout << (char *)p << "\n";
castsp
back to achar *
and prints the string starting from the new address. Sincep
now points to the character ‘t’, it prints “to C++”.
Question 7
Identify the bug in the following code, and give the minimal and the best way to fix it.
#include <bits/stdc++.h>using namespace std;
class A {public: int n; int *arr;
A(): n(0), arr(nullptr) {} A(int n): n(n), arr(new int[n]) {}};
class B {public: A a;
B() {} B(int n): a(n) {}}
int main() { B b(10); b.a.arr[0] = 1;
B b2 = b; b.a.arr[1] = 2;
cout << b2.arr[0] << " " << b2.arr[1] << "\n";}
Design Question
There is a module M
which needs to be provided with UDP packet data in an efficient and in order manner. The UDP packets are sequenced , , and so on. Since the data in from a UDP stream, it is possible that some of the packets are dropped in the transimission. You are required to implement a wrapper W
around the module M
, which can manage these drops and provide a reliable and in-order service to M
.
To recover the dropped packets, you can contact the main exchange to request a snapshot of the last recorded data it has. This exchange takes over a line that is seperate from the UDP connection. The exchange responds with a packet, which has a last packet number field called , and the data of all the packets from to sequence number in a compressed manner. The exchange updates the snapshot approximately every seconds, and increments the sequence number according.
The module M
has two methods:
process(packet)
- which processes a packet with the given sequence number.handleSnapshot(snapshot)
- which handles the snapshot data and processes all the packets from to
The exchange has a method:
getSnapshot()
- which returns the latest snapshot data.
You need to design the wrapper W
around the module M
(give the pseudo-code), which can handle the UDP packets and the snapshots efficiently. Processing the snapshot is an expensive operation, so it should be done only when necessary. You can assume that the wrapper W
can repeatedly poll the attached network interface to check for new UDP packets, and can also poll the exchange for the latest snapshot data.
Answer
class Wrapper { map<int, Packet> packets; // Store received packets that were not sent yet int maxSeenSequence = 0; // Track the highest sequence number seen int expectedPacket = 1; // Track the next expected packet sequence number
Module m; Exchange exchange;
void run() { // Start the recovery process in a new thread thread recoveryThread(&Wrapper::recovery, this);
while (true) { Packet packet = pollForUDPPacket();
// Process the packet immediately if it's the expected one if (packet.seq == expectedPacket) { m.process(packet); expectedPacket++; processBufferedPackets(); // Process any buffered packets that are now in order } else if (packet.seq > expectedPacket) { // Buffer the packet for later processing packets[packet.seq] = packet; maxSeenSequence = max(maxSeenSequence, packet.seq); } } }
void processBufferedPackets() { while (packets.find(expectedPacket) != packets.end()) { Packet nextPacket = packets[expectedPacket]; m.process(nextPacket); packets.erase(expectedPacket); expectedPacket++; } }
void recovery() { do { if (shouldRequestSnapshot()) { Snapshot snapshot = exchange.getSnapshot(); if (snapshot.lastSeq > maxSeenSequence) { // Update expectedPacket to the next sequence number after the snapshot m.handleSnapshot(snapshot); expectedPacket = snapshot.lastSeq + 1; processBufferedPackets(); } } } while (true); }
bool shouldRequestSnapshot() { // Implement a sort of exponential backoff // or fixed interval to request snapshots checking if the snapshot // was different from the last one }};
The code above implements a wrapper W
around the module M
that handles UDP packets and snapshots efficiently. The wrapper uses a map to buffer packets that arrive out of order, and it processes them in sequence as they become available. The recovery thread periodically checks for new snapshots from the exchange and processes a snapshot only when the snapshot contains the data for all the dropped packets till now. There would be a need for synchronization mechanisms (like mutexes) to ensure thread safety between the main thread and the recovery thread, especially when accessing shared data like packets
and expectedPacket
.
This problem alternatively can also be explained as a variant of the “Consumer-Producer” problem, where the main thread is the producer of packets (from the UDP stream) and the recovery thread is the consumer (which processes the snapshot data), and need to be synchronized properly to ensure that the packets along with the option to request a snapshot.
DSA Question
You are given a undirected graph with nodes and edges. You need to the colour each node with one of colours , or such that:
- For every edge , .
- The number of nodes of each colour is equal.
Return the number of ways to colour the graph, or if it is not possible to colour the graph. Return the answer modulo .
Answer
If the whole graph is connected, then it would be only possible to colour the graph if:
- The number of nodes is divisible by .
- The graph is bipartite, i.e., it can be coloured with two colours such that no two adjacent nodes have the same colour.
- In the bipartite graph, let the number of nodes in the first part be and in the second part be , such that . Then, it must hold that .
Then we can colour all the nodes in the smaller part with colour , and the nodes in the larger part with colours and alternatively.
Now since the graph can be disconnected, the conditions change a bit:
- The number of nodes in the whole graph must be divisible by (and not in each component).
- Each component must be bipartite.
- In the final colouring, there must be nodes from the “smaller” part, and nodes from the “larger” part. Thus the number of colourings can be calculated by choosing nodes from the “larger” part and colouring them with colour , and the rest nodes with colour . Thus the number of colourings would be .
- This would need to be multiplied by the number of ways to label each of the component parts as “smaller” or “larger” such that the sum condition holds. To calculate this, we can use dynamic programming to count the number of ways to choose the nodes from each component.
The overall time-complexity of the solution would be .
1#include <bits/stdc++.h>2using namespace std;3
4#define ll long long5#define vi vector<int>6#define vvi vector<vector<int>>7#define vvll vector<vector<ll>>8
9const ll MOD = 1e9 + 7;10
11ll power(ll a, ll b) {12 ll res = 1;13 while (b > 0) {14 if (b & 1)15 res = (res * a) % MOD;16 a = (a * a) % MOD;17 b >>= 1;18 }19 return res;20}21
22
23// Returns nCr(2a, a)24ll getBinomial(ll a) {25 ll facA = 1;26 ll fac2A = 1;27 for (ll i = 2; i <= 2 * a; i++) {28 fac2A = (fac2A * i) % MOD;29 if (i == a)30 facA = fac2A;31 }32
33 ll dr = power(facA, MOD - 2);34 dr = (dr * dr) % MOD;35 return (fac2A * dr) % MOD;36}37
38// Assumes that node u is already coloured, and we need to colour39// it's neighbours with the other two colours.40// Also pushes the coloured nodes to col1 and col2 vectors.41bool isBipartite(int u, vvi &g, vi &col, vi &col1, vi &col2) {42 for (int v : g[u]) {43 if (col[v] == -1) {44 // Colour the neighbour with the other colour45 col[v] = 1 - col[u];46 if (col[v] == 0) col1.push_back(v);47 else col2.push_back(v);48 if (!isBipartite(v, g, col, col1, col2)) return false;49 } else if (col[v] == col[u]) {50 // If the neighbour has the same colour, then it's not bipartite51 return false;52 }53 }54
55 return true;56}57
58ll getWays(int idx, int tar, vi &a, vi &b, vvll &dp) {59 if (idx == a.size())60 return tar == 0;61 if (tar < 0)62 return 0;63
64 if (dp[idx][tar] != -1)65 return dp[idx][tar];66
67 ll res = getWays(idx + 1, tar - a[idx], a, b, dp);68 res += getWays(idx + 1, tar - b[idx], a, b, dp);69 res %= MOD;70 dp[idx][tar] = res;71 return res;72}73
74int main() {75 int n, m;76 cin >> n >> m;77
78 vvi g(n);79 for (int i = 0; i < m; i++) {80 int u, v;81 cin >> u >> v;82 g[u - 1].push_back(v - 1);83 g[v - 1].push_back(u - 1);84 }85
86 if (n % 3 != 0) {87 cout << 0 << "\n"; // Not possible to colour88 return 0;89 }90
91 vector<int> col(n, -1); // -1 means uncoloured92 vector<int> a, b;93 for (int i = 0; i < n; i++) {94 if (col[i] != -1)95 continue;96 vector<int> col1, col2;97
98 col[i] = 0;99 col1.push_back(i);100 if (!isBipartite(i, g, col, col1, col2)) {101 cout << 0 << "\n"; // Not bipartite102 return 0;103 }104
105 a.push_back(col1.size());106 b.push_back(col2.size());107 }108
109 int s = n / 3;110 vvll dp(a.size(), vector<ll>(s + 1, -1));111 ll ways = getWays(0, s, a, b, dp);112
113 cout << (ways * getBinomial(s)) % MOD << "\n";114 return 0;115}
Descriptive Questions
-
What are virtual functions in C++? Why do we need them? Can static member functions be virtual?
-
What is 3-way handshake in TCP, and why is it necessary?
-
Why do we need both IP addresses and MAC addresses in networking? Since MAC addresses are unique to each device, why not just use them for routing?
-
If class
A
is a friend of classB
, and classC
is a derived class ofB
, then can you access the private members ofA
fromC
? (No) -
What is the use of the
inline
keyword in C++? What are the advantages and disadvantages of using it?Answer
In C++, the
inline
keyword is a request to the compiler to replace a function call with the actual function body at the point of the call. The primary goal of the same is to reduce function call overhead, which includes the time taken to push arguments onto the stack, jump to the function’s code, and return. Inline functions are most effective for small functions that are called frequently.The important fact to note that the
inline
keyword is a request, not a command. The compiler may choose not to inline a function based on various factors, like function size, complexity, or compiler settings. Excessive inlining can increase code size, potentially impacting performance due to increased instruction cache misses.inline
functions are often defined in header files because the compiler needs the function’s definition to perform inlining. Functions defined within a class definition are implicitly inline.Some functions cannot be inlined, such as recursive functions or functions with loops or switch statements. Since C++17, variables can also be declared inline, mainly to avoid multiple definitions in different translation units.
-
Is
malloc
a system call?Answer
malloc
is not a system call; it is a library function provided by the C standard library (or C++ standard library)libc
. It allocates memory from the heap and returns a pointer to the allocated memory block. The actual memory allocation is typically done using system calls likebrk
ormmap
, which are lower-level functions that interact directly with the operating system to manage memory.When you call
malloc
, it may internally use these system calls to request memory from the operating system, butmalloc
itself is not a system call. It is a higher-level abstraction that simplifies memory management for programmers.Examples of system calls include
read
,write
,open
,close
,fork
, andexec
. These system calls provide a direct interface to the operating system’s kernel, allowing programs to perform operations like file I/O, process management, and inter-process communication. -
Can function overloading be done based on the
const
qualifier?Answer
Yes, function overloading can be done based on the
const
qualifier in C++. This is particularly useful when you have member functions that behave differently depending on whether they are called on aconst
object or a non-const
object.For example, consider the following code:
1class MyClass {2public:3void display() {4std::cout << "Non-const display\n";5}67void display() const {8std::cout << "Const display\n";9}10};1112int main() {13MyClass obj;14const MyClass constObj;15obj.display(); // Calls non-const version16constObj.display(); // Calls const version1718return 0;19}You can also use
const
qualifiers in function parameters to overload functions. The same would only work for pointers or references, not for values (as values are always copied).1void process(int &x) {2std::cout << "Non-const reference: " << x << "\n";3}4void process(const int &x) {5std::cout << "Const reference: " << x << "\n";6}78int main() {9int a = 10;10const int b = 20;11process(a); // Calls non-const version12process(b); // Calls const version1314return 0;15} -
What is memory alignment in C++? Why is it necessary?
Answer
Memory alignment refers to placing data at memory addresses that are multiples of their size. Proper alignment is crucial for performance as it affects cache line usage, prevents CPU pipeline stalls, and is required for certain SIMD operations. Misaligned access can cause performance penalties or crashes on some architectures.
-
Explain the difference between paging and segmentation in detail. How do they relate to compilation and execution in C++?
Answer
Paging
Paging is a memory management technique used by operating systems where:
- The logical address space (used by processes) is divided into fixed-size blocks called pages.
- The physical memory (RAM) is divided into frames of the same size.
- Pages are mapped to frames using a page table maintained by the OS. Often, a Translation Lookaside Buffer (TLB) is also used to speed up address translation.
- It is related to the physical memory management, allowing processes to use more memory than physically available through swapping (as CPU works with virtual addresses).
- It eliminates external fragmentation.
Segmentation
Segmentation is another technique which:
- Divides memory into variable-sized segments based on logical divisions of a program.
- Each segment has a base address and a limit, which defines its size.
- It allows for more flexible memory management, as segments can grow or shrink independently.
- The various examples of segments of a program can include:
- Code segment (text)
- Data segment
- Stack segment
- Heap segment
- It is more abstract and logical, focusing on the structure of a program rather than physical memory management.
- It can lead to external fragmentation, as segments can vary in size.
Use in C++
Many modern OS (like Linux, Windows) use both in layered ways, but:
- Most modern x86 CPUs (in 64-bit mode) disable hardware segmentation.
- They instead rely on paging for memory protection and isolation.
In C++, you don’t interact directly with segmentation or paging. But:
-
C++ code maps to segments:
- .text → Code segment
- .data/.bss → Static/global segment
- heap → Managed via
new
/malloc
- stack → Function call stack
OS or executable format (like ELF) defines these segments during linking and loading.
-
C++ code is compiled to machine code, which is then loaded into memory by the OS. The OS uses paging to manage the memory of the process, mapping logical addresses used in the C++ code to physical addresses in RAM. You can also see it’s more explicit effects during page faults or memory mapped files.
-
What is the difference between
new
andmalloc
in C++? Discuss about the underlying memory management, intialization and metadata stored.Answer
-
malloc
is part of the C standard library, which is used to serve small allocations from an existing heap region. -
It can make use of system calls like
brk()
ormmap()
to allocate memory. -
malloc
implementations store a header before the actual memory block that contains metadata:- Size of the block
- Allocation flags
- Free/used state
- Possibly pointers to adjacent blocks (for free list)
[metadata][user data pointer returned by malloc]- On the other hand,
new
is a C++ operator that does more than just allocate memory. It:- Allocates memory for an object and then calls its constructor.
- It returns a typed pointer to the object, not just a
void*
. - It can throw exceptions (like
std::bad_alloc
) if allocation fails, whilemalloc
returnsNULL
.
- The
new
operator can be overloaded for each class to customize memory allocation. No extra metadata is stored by the language itself, but youroperator new
implementation can store info (e.g., for debugging). - The compiler and runtime rely on object layout and virtual tables (vtable pointers) for polymorphic behavior, not on heap metadata.
You can use the
strace
command to see the system calls made bymalloc
andnew
in a C++ program. For example, if you have a program calledyour_program
, you can run:Terminal window strace ./your_programYou’ll see calls like:
Terminal window brk(NULL) = 0x555555758000brk(0x555555779000) = 0x555555779000mmap(NULL, 4096, ...) = 0x7ffff7fb4000While
malloc
andfree
are used for raw memory allocation and deallocation,new
anddelete
are used for object-oriented memory management in C++.new
is not necessarily slower thanmalloc
, but it does additional work (like calling constructors and ensuring type safety), which can add overhead. -